(54) Method and apparatus for special video reproduction modes



(57) Special reproduction control information comprises a
plurality of items (100) of frame information. Each of the
items of frame information comprises video location
information (101) indicating the location of video data to be
reproduced in a special reproduction, and display time control
information (102) indicating the time for displaying the video
data.

Description

[0001] The present invention relates to a special reproduction
control information describing method for describing special
reproduction control information used to perform special
reproduction for target video contents, a special reproduction
control information creating method and a special reproduction
control information creating apparatus for creating the special
reproduction control information, and a video reproduction
apparatus and method for performing special reproduction by
using the special reproduction control information.
[0002] In recent years, a motion picture is compressed as a
digital video and is stored in disk media represented by a DVD
and a HDD so that a video can be reproduced at random. A video
can be reproduced halfway from a desired timing with virtually
no waiting time. As in conventional tape media, disk media can
be fast reproduced at two to four times speed or can be
reversely reproduced.

[0003] However, there is a problem in that the length of a
video can be very long in many cases, and time cannot be
sufficiently compressed to view the whole contents of the video
even at two to four times fast reproduction. When the rate of
the fast reproduction is increased, the scene change is
enlarged to a degree exceeding the ability to view it, so that
grasping the contents is difficult, and even portions which are
not needed are also reproduced so that waste is caused.
[0004] Accordingly, the present invention is directed to a
method and apparatus that substantially obviate one or more of
the problems due to limitations and disadvantages of the
related art.

[0005] According to one aspect of the present invention, a
method of describing frame information comprises:

describing, for a frame extracted from a plurality of frames in
a source video data, first information specifying a location of
the extracted frame in the source video data; and

describing, for the extracted frame, second information
relating to a display time of the extracted frame.

[0006] According to another aspect of the present invention, an
article of manufacture comprising a computer usable medium
storing frame information, the frame information comprises:

first information, described for a frame extracted from a
plurality of frames, specifying a location of the extracted
frame in the source video data; and

second information, described for the extracted frame, relating
to a display time of the extracted frame.

[0007] According to another aspect of the present invention, an
apparatus for creating frame information comprises:

a unit configured to extract a frame from a plurality of frames
in a source video data;

a unit configured to create the frame information including
first information specifying a location of the extracted frame
and second information relating to a display time of the
extracted frame; and

a unit configured to link the extracted frame to the frame
information.

[0008] According to another aspect of the present invention, a
method of creating frame information comprises:

extracting a frame from a plurality of frames in a source video
data; and

creating the frame information including first information
specifying a location of the extracted frame in the source
video data and second information relating to a display time of
the extracted frame.

[0009] According to another aspect of the present invention, an
apparatus for performing a special reproduction comprises:

a unit configured to refer to frame information described for a
frame extracted from a plurality of frames in a source video
data and including first information specifying a location of
the extracted frame in the source video data and second
information relating to a display time of the extracted frame;

a unit configured to obtain the video data corresponding to the
extracted frame based on the first information;

a unit configured to determine the display time of the
extracted frame based on the second information; and

a unit configured to display the obtained video data for the
determined display time.

[0010] According to another aspect of the present invention, an
article of manufacture comprising a method of performing a
special reproduction comprises:

referring to frame information described for a frame extracted
from a plurality of frames in a source video data and including
first information specifying a location of the extracted frame
and second information relating to a display time of the
extracted frame;

obtaining the video data corresponding to the extracted frame
based on the first information;

determining the display time of the extracted frame based on
the second information; and

displaying the obtained video data for the determined display
time.

[0011] According to another aspect of the present invention, an
article of manufacture comprising a computer usable medium
having computer readable program code means embodied therein,
the computer readable program code means performing a special
reproduction, the computer readable program code means
comprises:

computer readable program code means for causing a computer to
refer to frame information described for a frame extracted from
a plurality of frames in a source video data and including
first information specifying a location of the extracted frame
and second information relating to a display time of the
extracted frame;

computer readable program code means for causing a computer to
obtain the video data corresponding to the extracted frame
based on the first information;

computer readable program code means for causing a computer to
determine the display time of the extracted frame based on the
second information; and

computer readable program code means for causing a computer to
display the obtained video data for the determined display
time.

[0012] According to another aspect of the present invention, an
article of manufacture comprising a method of describing sound
information, the method comprises:

describing, for a frame extracted from a plurality of sound
frames in a source sound data, first information specifying a
location of the extracted frame in the source sound data; and

describing, for the extracted frame, second information
relating to a reproduction start time and reproduction time of
the sound data of the extracted frame.

[0013] According to another aspect of the present invention, an
article of manufacture comprising a computer usable medium
storing frame information, the frame information comprises:

first information, described for a frame extracted from a
plurality of sound frames, specifying a location of the
extracted frame in the source sound data; and

second information, described for the extracted frame, relating
to a reproduction start time and reproduction time of the sound
data of the extracted frame.



[0014] According to another aspect of the present invention, an
article of manufacture comprising a method of describing text
information, the method comprises:

describing, for a frame extracted from a plurality of text
frames in a source text data, first information specifying a
location of the extracted frame in the source text data; and

describing, for the extracted frame, second information
relating to a display start time and display time of the text
data of the extracted frame.

[0015] According to another aspect of the present invention, an
article of manufacture comprising a computer usable medium
storing frame information, the frame information comprises:

first information, described for a frame extracted from a
plurality of text frames in a source text data, specifying a
location of the extracted frame in the source text data; and

second information, described for the extracted frame, relating
to a display start time and display time of the text data of
the extracted frame.

[0016] This summary of the invention does not necessarily
describe all necessary features so that the invention may also
be a sub-combination of these described features.

[0017] The present invention can be implemented either in
hardware or in software in a general purpose computer. Further,
the present invention can be implemented in a combination of
hardware and software. The present invention can also be
implemented by a single processing apparatus or a distributed
network of processing apparatuses.

[0018] Since the present invention can be implemented by
software, the present invention encompasses computer code
provided to a general purpose computer on any suitable carrier
medium. The carrier medium can comprise any storage medium such
as a floppy disk, a CD ROM, a magnetic device or a programmable
memory device, or any transient medium such as any signal, e.g.
an electrical, optical or microwave signal.
[0019] The invention can be more fully understood from the
following detailed description when taken in conjunction with
the accompanying drawings, in which:

FIG. 1 is a view showing an example of a data structure of
special reproduction control information according to one
embodiment of the present invention;

FIG. 2 is a view showing an example of a structure of a special
reproduction control information creating apparatus;

FIG. 3 is a view showing another example of structure of the
special reproduction control information creating apparatus;

FIG. 4 is a flowchart showing one example for the apparatus
shown in FIG. 2;

FIG. 5 is a flowchart showing one example for the apparatus
shown in FIG. 3;
FIG. 6 is a view showing an example of a structure of a video
reproduction apparatus;
FIG. 7 is a flowchart showing one example for the apparatus
shown in FIG. 6;
FIG. 8 is a view showing an example of a data structure of
special reproduction control information;
FIG. 9 is a view explaining video location information for
referring to an original video frame;
FIG. 10 is a view explaining video location information for
referring to an image data file;
FIG. 11 is a view explaining a method for extracting video data
in accordance with a motion of a screen;
FIG. 12 is a view explaining video location information for
referring to the original video frame;
FIG. 13 is a view for explaining video location information for
referring to the image data file;
FIG. 14 is a view showing an example of a data structure of
special reproduction control information in which plural
original video frames are referred to;
FIG. 15 is a view explaining a relation between the video
location information and the original plural video frames;
FIG. 16 is a view explaining a relation between the image data
file and the original plural video frames;
FIG. 17 is a view explaining video location information for
referring to the original video frame;
FIG. 18 is a view for explaining video location information for
referring to the image data file;
FIG. 19 is a flowchart for explaining a special reproduction;
FIG. 20 is a view for explaining a method for extracting video
data in accordance with a motion of a screen;
FIG. 21 is a view for explaining a method for extracting video
data in accordance with a motion of a screen;

FIG. 22 is a flowchart showing one example for calculating
display time at which a scene change quantity becomes constant
as much as possible;
FIG. 23 is a flowchart showing one example for calculating a
scene change quantity of the whole frame from an MPEG video;
FIG. 24 is a view for explaining a method for calculating a
scene change quantity of a video from an MPEG stream;
FIG. 25 is a view for explaining a processing procedure for
calculating display time at which a scene change quantity
becomes constant as much as possible;
FIG. 26 is a flowchart showing one example of the processing
procedure for conducting special reproduction on the basis of
special reproduction control information;
FIG. 27 is a flowchart showing one example for conducting
special reproduction on the basis of a display cycle;
FIG. 28 is a view for explaining a relationship between a
calculated display time and the display cycle;
FIG. 29 is a view for explaining a relationship between a
calculated display time and the display cycle;

FIG. 30 is a view showing another example of a data structure
of special reproduction control information;
FIG. 31 is a view explaining a method for extracting video data
in accordance with a motion of a screen;
FIG. 32 is a view explaining video location information for
referring to the original video frame;
FIG. 33 is a view showing another example of a data structure
of special reproduction control information;
FIG. 34 is a view showing another example of a data structure
of special reproduction control information;
FIG. 35 is a view showing another example of a data structure
of special reproduction control information;
FIG. 36 is a flowchart showing one example for calculating
display time from the importance;
FIG. 37 is a view for explaining a method for calculating
display time from the importance;
FIG. 38 is a flowchart showing one example for calculating
importance data on the basis of the idea that a scene having a
large sound level is important;
FIG. 39 is a flowchart showing one example for calculating
importance data on the basis of the idea that a scene in which
many important words appear with sound recognition is
important, or a processing procedure for calculating importance
data on the basis of the idea that the scene in which the
number of words spoken per unit time is large is important;
FIG. 40 is a flowchart showing one example for calculating
importance data on the basis of the idea that a scene in which
many important words appear with telop recognition is
important, or a processing procedure for calculating importance
data on the basis of the idea that the scene in which the
number of words included in the telop which appears per unit
time is large with telop recognition is important;
FIG. 41 is a flowchart showing one example for calculating
importance data on the basis of the idea that the scene in
which a large character appears as a telop is important;
FIG. 42 is a flowchart showing one example for calculating
importance data on the basis of the idea that the scene in
which many human faces appear is important, or a processing for
calculating importance data on the basis of the idea that the
scene where human faces are displayed in an enlarged manner is
important;

FIG. 43 is a flowchart showing one example for calculating
importance data on the basis of the idea that the scene in
which videos similar to the registered important scene appear
is important;
FIG. 44 is a view showing another example of a data structure
of special reproduction control information;
FIG. 45 is a view showing another example of a data structure
of special reproduction control information;
FIG. 46 is a view showing another example of a data structure
of special reproduction control information;
FIG. 47 is a view for explaining a relationship between
information as to whether the scene is to be reproduced or not
and the reproduced video;
FIG. 48 is a flowchart showing one example of a processing
procedure of special reproduction including reproduction and
non-reproduction judgment;
FIG. 49 is a view showing one example of a data structure when
sound information or text information is added;
FIG. 50 is a view showing one example of a data structure for
describing only sound information separately from frame
information;
FIG. 51 is a view showing one example of a data structure for
describing only text information separately from frame
information;
FIG. 52 is a view for explaining a synchronization of a
reproduction of each of media;
FIG. 53 is a flowchart showing one example of a determination
procedure of a sound reproduction start time and a sound
reproduction time in a video frame section;
FIG. 54 is a flowchart showing one example for preparing
reproduction sound data and correcting video frame display
time;
FIG. 55 is a flowchart showing one example of a processing
procedure of obtaining text information with telop recognition;
FIG. 56 is a flowchart showing one example of a processing
procedure of obtaining text information with sound recognition;
FIG. 57 is a flowchart showing one example of a processing
procedure of preparing text information;
FIGS. 58A and 58B are views for explaining a method of
displaying text information;
FIG. 59 is a view showing one example of a data structure of
special reproduction control information for sound information;
FIG. 60 is a view showing another example of a data structure
of special reproduction control information for sound
information;
FIG. 61 is a view explaining a summary reproduction of the
sound/music data; and
FIG. 62 is a view explaining another summary reproduction of
the sound/music data.

[0020] Preferred embodiments of the present inven- 
tion will now be described with reference to the accom- 
panying drawings. 



[0021] The embodiments relate to a reproduction of video
contents having video data using special reproduction control
information. The video data comprises a set of video frames
(video frame group) constituting a motion picture.

[0022] The special reproduction control information is 
created from the video data by a special reproduction 
control information creating apparatus and attached to 
the video data. The special reproduction is reproduction 
by a method other than a normal reproduction. The spe- 
cial reproduction includes a double speed reproduction 
(or a high speed reproduction), jump reproduction (or 
jump continuous reproduction), and a trick reproduction. 
The trick reproduction includes a substituted reproduc- 
tion, an overlapped reproduction, a slow reproduction 
and the like. The special reproduction control informa- 
tion is referred to when the special reproduction is exe- 
cuted in the video reproduction apparatus. 
[0023] FIG. 1 shows one example of a basic data 
structure of the special reproduction control information. 
[0024] In this data structure, plural items of frame
information "i" (i = 1 to N) are described in correspondence to
the frame appearance order in the video data. Each frame
information 100 includes a set of video location information
101 and display time control information 102. The video
location information 101 indicates a location of video data to
be displayed at the time of special reproduction. The video
data to be displayed may be one frame, a group of a plurality
of continuous frames, or a group formed of a part of a
plurality of continuous frames. The display time control
information 102 forms the basis for calculating the display
time of the video data.
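
The data structure of FIG. 1 can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the format defined by the embodiment: the class and field names are hypothetical, the video location is assumed to be a frame index, and the display time control information is assumed to be a display time in seconds.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FrameInfo:
    """One item of frame information 100 (names are illustrative only)."""
    video_location: int   # video location information 101, assumed to be a frame index
    display_time: float   # display time control information 102, assumed to be seconds


@dataclass
class SpecialReproductionControlInfo:
    """Container for the N items of frame information of FIG. 1."""
    frames: List[FrameInfo] = field(default_factory=list)
    # Reproduction rate information 103 is optional, so it defaults to None.
    reproduction_rate: Optional[float] = None

    def total_display_time(self) -> float:
        """Sum of the display times of all described frames."""
        return sum(f.display_time for f in self.frames)


info = SpecialReproductionControlInfo(
    frames=[FrameInfo(0, 0.5), FrameInfo(120, 1.0), FrameInfo(300, 0.25)]
)
```

A group of continuous frames could be represented in the same spirit by replacing the single index with a start and end index; that variant is likewise an assumption.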

[0025] In FIG. 1, the frame information "i" is arranged in the
order of the appearance of frames in the video data. When
information indicating an order of frame information is
described in the frame information "i", the frame information
"i" may be arranged and described in any order.

[0026] The reproduction rate information 103 attached to a
plurality of items of frame information "i" shows the
reproduction speed rate and is used for designating
reproduction at a speed several times higher than that
corresponding to the display time described by the display time
control information 102. However, the reproduction rate
information 103 is not essential information. The information
103 may constantly be attached, may constantly not be attached,
or may be selectively attached. Even when the reproduction rate
information 103 is attached, the information may not be used at
the time of special reproduction. The reproduction rate
information may be constantly used, or may be selectively used.
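
As a hedged illustration of how optional reproduction rate information could act on a described display time, the sketch below assumes that a rate of r simply divides the display time by r. The text does not fix this formula; the function name and the division rule are assumptions.

```python
from typing import Optional


def effective_display_time(display_time: float, rate: Optional[float] = None) -> float:
    """Scale a described display time by an optional reproduction rate.

    Assumption: a rate of r shortens the display time to display_time / r.
    When no rate is attached (or it is not used), the described display
    time is returned unchanged.
    """
    if rate is None:
        return display_time
    if rate <= 0:
        raise ValueError("reproduction rate must be positive")
    return display_time / rate
```

For example, under this assumption a frame described with a display time of 2.0 seconds would be shown for 0.5 seconds at a rate of 4.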
[0027] In FIG. 1, it is possible to further add other control
information to the frame information group together with the
reproduction rate information or in place of the reproduction
rate information. In FIG. 1, it is also possible to add
different control information to each frame information "i". In
these cases, all of the information included in the special
reproduction control information may be used on the side of the
video reproduction device, or only a part of the information
may be used.

EP 1 168 840 A2

[0028] FIG. 2 shows an example of a structure of an apparatus
for creating special reproduction control information.

[0029] This special reproduction control information creating
device comprises a video data storage unit 2, a video data
processing unit 1 including a video location information
processing unit 11 and a display time control information
processing unit 12, and a special reproduction control
information storage unit 3. As will be described in detail
later, since the video data (encoded data) is decoded before
being displayed, a processing time for decoding the video data
is required from when the display instruction is issued until
the video is displayed. In order to reduce this processing
time, it is proposed to decode the video data beforehand and
store it as an image data file.

[0030] If an image data file is used (the image data file may
be constantly used, or may be selectively used), an image data
file creating unit 13 (in the video data processing unit 1) and
an image data file storage unit 14 are further provided as
shown in FIG. 3. If other control information determined on the
basis of the video data is added to the special reproduction
control information, the corresponding function is
appropriately added to the inside of the video data processing
unit 1.

[0031] If an operation by a user intervenes in this processing,
a GUI is used for displaying, for example, video data in frame
units and for receiving an input of an instruction by the user,
though the GUI is omitted in FIGS. 2 and 3.

[0032] In FIGS. 2 and 3, a CPU, a memory, an external storage
device and a network communication device that are provided
when needed, as well as software such as driver software and an
OS that are used when needed, are not shown.

[0033] The video data storage unit 2 stores video data which is
the target of processing for creating special reproduction
control information (or special reproduction control
information and image data files).
[0034] The special reproduction control information storage
unit 3 stores special reproduction control information that has
been created.
[0035] The image data file storage unit 4 stores image data
files that have been created.
[0036] The storage units 2, 3, and 4 comprise, for example, a
hard disk, an optical disk, or a semiconductor memory. The
storage units may comprise separate storage devices. All or
part of the storage units may comprise the same storage device.
[0037] The video data processing unit 1 creates the special
reproduction control information (or the special reproduction
control information and image data file) on the basis of the
video data which is the target of processing.
[0038] The video location information processing unit 11
determines (extracts) a video frame (group) which should be
displayed or which can be displayed at the time of special
reproduction, and prepares the video location information 101
which should be described in each frame information "i".
[0039] The display time control information processing unit 12
prepares the display time control information 102 associated
with the display time of the video frame (group) associated
with each frame information "i".

[0040] The image data file creating unit 13 prepares an image
data file from the video data.

[0041] The special reproduction control information creating
apparatus can be realized, for example, in the form of software
running on a computer. The apparatus may also be realized as a
dedicated apparatus for creating the special reproduction
control information.
[0042] FIG. 4 shows an example of a processing procedure in a
case of the structure of FIG. 2. The video data is read (step
S11), video location information 101 is created (step S12),
display time control information 102 is created (step S13), and
special reproduction control information is stored (step S14).
The procedure of FIG. 4 may be consecutively conducted for each
frame information, or each processing may be conducted in
batches. Other procedures can also be conducted.
[0043] FIG. 5 shows an example of a processing procedure in a
case of the structure of FIG. 3. A procedure for preparing and
storing image data files is added to the procedure of FIG. 4
(step S22). The image data file is created and/or stored
together with the preparation of the video location information
101. It is also possible to create the video location
information 101 at a timing different from that of FIG. 4. In
the same manner as the case of FIG. 4, the procedure of FIG. 5
may be conducted for each frame information, or may be
conducted in batches. Other procedures can also be conducted.
[0044] FIG. 6 shows an example of a video reproduction
apparatus.

[0045] This video reproduction apparatus comprises a controller
21, a normal reproduction processing unit 22, a special
reproduction processing unit 23, a display device 24, and a
contents storage unit 25. If contents are handled wherein audio
such as sound or the like is added to the video data, it is
preferable to provide a sound output section. If contents are
handled wherein text data is added to the video data, the text
may be displayed on the display device 24. If contents are
handled wherein a program is attached, an attached program
execution section may be provided.
[0046] The contents storage unit 25 stores at least video data
and special reproduction control information. As will be
described in detail later, in the case where the image data
file is used, the image data file is further stored. The sound
data, the text data, and the attached program are further
stored in some cases.
[0047] The contents storage unit 25 may be arranged at one
location in a concentrated manner, or may be arranged in a
distributed manner. The point is that the contents can be
accessed by the normal reproduction processing unit 22 and the
special reproduction processing unit 23. The video data,
special reproduction control information, image data files,
sound data, text data, and attached program may be stored in
separate media or may be stored in the same medium. As the
medium, for example, a DVD is used. These may also be data
which are transmitted via a network.

[0048] The controller 21 basically receives an instruction such
as a normal reproduction or a special reproduction with respect
to the contents from the user via a user interface such as a
GUI or the like. The controller 21 gives the corresponding
processing unit an instruction to reproduce the designated
contents by the designated method.
[0049] The normal reproduction processing unit 22 is used for
the normal reproduction of the designated contents.

[0050] The special reproduction processing unit 23 is used for
the special reproduction (for example, a high speed
reproduction, jump reproduction, trick reproduction, or the
like) of the designated contents by referring to the special
reproduction control information.
[0051] The display device 24 is used for displaying a 
video. 

[0052] The video reproduction apparatus can be real- 
ized by computer software. It may partially be realized 
by hardware (for example, decode board (MPEG-2 de- 
coder) or the like). The video reproduction apparatus 
may be realized as a dedicated device for video repro- 
duction. 
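
Since the apparatus can be realized by software, the division of roles between the controller 21 and the processing units 22 and 23 can be sketched as follows. This is a minimal sketch under assumptions: frames are modelled as strings, the control information as (frame index, display time) pairs, and the 30 frames per second period for normal reproduction is hypothetical. All function names are illustrative.

```python
from typing import List, Sequence, Tuple

# (frame index, display time in seconds); this representation is an assumption.
ControlInfo = Sequence[Tuple[int, float]]


def normal_reproduction(video: Sequence[str]) -> List[Tuple[str, float]]:
    """Show every frame once at an assumed fixed frame period."""
    frame_period = 1.0 / 30.0  # hypothetical 30 frames per second
    return [(frame, frame_period) for frame in video]


def special_reproduction(video: Sequence[str], control: ControlInfo) -> List[Tuple[str, float]]:
    """Show only the frames named by the control information, each for its display time."""
    return [(video[location], display_time) for location, display_time in control]


def controller(mode: str, video: Sequence[str], control: ControlInfo) -> List[Tuple[str, float]]:
    """Pass the user's request to the matching processing unit."""
    if mode == "normal":
        return normal_reproduction(video)
    if mode == "special":
        return special_reproduction(video, control)
    raise ValueError(f"unknown reproduction mode: {mode}")


video = [f"frame{i}" for i in range(10)]
control = [(0, 0.5), (4, 1.0), (9, 0.25)]
schedule = controller("special", video, control)
```

Each function returns a display schedule rather than driving a display device, which keeps the sketch independent of any particular decoder or output hardware.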



[0053] FIG. 7 shows one example of a reproduction processing
procedure of the video reproduction apparatus of FIG. 6. At
step S31, it is determined whether the user requests a normal
reproduction or a special reproduction. When a normal
reproduction is requested, the designated video data is read at
step S32 and a normal reproduction is conducted at step S33.
When a special reproduction is requested by the user, the
special reproduction control information corresponding to the
designated video data is read at step S34, and the location of
the video data to be displayed is specified and the display
time is determined at step S35. The corresponding frame (group)
is read from the video data (or the image data file) at step
S36 to conduct special reproduction of the designated contents
at step S37. The location of the video data can be specified
and the display time can be determined at a timing different
from that in FIG. 7. The procedure of FIG. 7 may be
consecutively conducted for each frame information, or each
processing may be conducted in batches. Other procedures can be
conducted. For example, in the case of the reproduction method
in which the display time of each frame is equally set to a
constant value, it is not necessary to determine the display
time.
[0054] Both in the normal reproduction and in the special
reproduction, the user may make various designations (for
example, the start point or the end point of the reproduction
in the contents, a reproduction speed in the high speed
reproduction, a reproduction time in the high speed
reproduction, and other methods of special reproduction or the
like).
[0055] Next, an algorithm for creating the frame information of
the special reproduction control information and an algorithm
for calculating the display time of the special reproduction
will be schematically explained.
[0056] At the time of creating the frame information, the frame
to be used at the time of the special reproduction is
determined from the video data, the video location information
is created, and the display time control information is
created.
[0057] The frame is determined by such methods as: (1) a method
for calculating the video frame on the basis of some
characteristic quantity with respect to the video data (for
example, a method for extracting the video frames such that the
total of the characteristic quantity (for example, the scene
change quantity) between the extracted frames becomes constant,
and a method for extracting the video frames such that the
total of importance between the extracted frames becomes
constant), and (2) a method for calculating the video frame on
a fixed standard (for example, a method for extracting frames
at random, and a method for extracting frames at an equal
interval). The scene change quantity is also called a frame
activity value.
[0058] In the creation of the display time control information
121, there are available: (i) a method for calculating an
absolute value or a relative value of the display time or a
display frame number, (ii) a method for calculating reference
information which is a base of the display time and a display
frame number (for example, the information designated by the
user, characters in the video, sound synchronized with video,
persons in the video, and the importance obtained on the basis
of the specific pattern in the video), and (iii) a method for
describing both (i) and (ii).

[0059] It is possible to appropriately combine (1) or (2)
with (i), (ii) or (iii). Needless to say, other methods are
possible. One specific combination out of such methods can be
used, or a plurality of combinations of these methods may be
used and appropriately selected.

[0060] In a specific case, at the same time as the
determination of the frame by the method (1), a relative
value of the display time and the number of display frames
are determined. If this method is constantly used, it is
possible to omit the display time control information
processing unit 102.

[0061] At the time of the special reproduction, it is assumed
that the special reproduction is conducted by referring to
the display time control information 121 of (i), (ii) or
(iii) included in the frame information. However, the
described value may be followed, or the described value may
be corrected and used. In addition to the described value and
the corrected value thereof, independently created other
information and information input from the user may be used.
Alternatively, only the independently created other
information and the information input from the user may be
used. A plurality of methods out of these methods are enabled
and can be

production) carries out reproduction in a time shorter than
the time required for the normal reproduction of the original
contents by reproducing a part of the frames out of the whole
frames constituting the video data contents. For example, the
frames indicated by the frame information are displayed, each
for the display time indicated by the display time control
information 121, in the order of time sequence. Based on a
request from the user, such as a speed designation request
for designating at how many times the speed of the normal
reproduction the original contents are reproduced (in what
fraction of the time required for the normal reproduction the
contents are reproduced) and a time designation request for
designating how much time is taken for reproducing the
contents, the display time of each frame (group) is
determined to satisfy the reproduction request. The high
speed reproduction is called a summarized reproduction.
[0064] A jump reproduction (or a jump [...] reproduction) is
such that a part of the frames shown in the frame information
is subjected to non-reproduction, for example, on the basis
of the reproduction/non-reproduction [...]. The high speed
reproduction is conducted with respect to the remaining
frames, excluding the frames which are subjected to
non-reproduction, out of the frames shown in the frame
information.
[0065] A trick reproduction is a reproduction other than the
normal reproduction, the high speed reproduction and the jump
reproduction. For example, at the time of reproducing the
frames shown in the frame information, there can be forms
such as: a substituted reproduction for reproducing a certain
portion by replacing the order of time sequence; an
overlapped reproduction for reproducing a certain portion
repeatedly a plurality of times; a variable speed
reproduction in which a certain portion is reproduced at
[...] the case in which the portion is reproduced at the
speed of normal reproduction, or the case in which the
portion is reproduced at a speed lower than the normal
reproduction speed, or at a speed higher than another
portion, or the reproduction of a certain portion is
temporarily suspended, or such forms of reproduction are
[...]; and a random reproduction for reproducing at random
[...] shown in the frame information.
[0066] Needless to say, it is possible to appropriately
combine a plurality of kinds of methods. For example, at the
time of the double speed, the important portion is reproduced
a plurality of times, and various variations [...]
reproduction speed to a normal reproduction speed.
[0067] Hereinafter, embodiments of the present invention will
be specifically explained in detail.
[0068] In the beginning, the embodiments will be explained by
taking as an example a case in which the reproduction frame
is determined on the basis of the scene change quantity
between adjacent frames as the characteristic quantity of the
video data.
[0069] Here, there will be explained a case in which one
frame is corresponded to one frame information.
[0070] FIG. 8 shows one example of a data structure of the
special reproduction control information created [...].
[...] display time information 121 is described, which is
information showing an absolute or a relative display time,
as the display time control information 102 in FIG. 1 (or
instead of the display time control information 102). A
structure describing the importance in addition to the
display time control information 102 will be described later.
[0071] The video location information 101 is information
[...] original video frame of the video, and any of a frame
[...] frame) or a number which specifies one frame in a
stream, such as a time stamp, may be used. If the video data
corresponding to the frame [...] from the video [...], a URL
may be used as information for specifying the file.
[0072] The display time [...] a relationship of the relative
time length with the [...]. In the latter case, the actual
reproduction time of [...] as a whole. With respect to each
video, the continuation time of the display is not described,
but such description [...] with a combination of the start
time and the continuation time may [...].

[0074] In the special reproduction, reproduction of the video
present at the location specified with the video location
information 101, only for the display time specified with the
display time information 121, is consecutively conducted for
the number of items of frame information "i" included in the
arrangement, such as shown in FIG. 8.
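
The arrangement of frame information and its consecutive
reproduction can be sketched as follows. This is a minimal
illustration, not the structure of FIG. 8 itself: the class name
FrameInfo, its field names and the function playback_schedule are
hypothetical, the video location is assumed to be a frame number, and
the display time is assumed to be an absolute value in seconds.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FrameInfo:
    """One item of frame information (field names are illustrative)."""
    video_location: int   # stands in for video location information 101
    display_time: float   # stands in for display time information 121 (seconds)


def playback_schedule(items: List[FrameInfo]) -> List[Tuple[int, float, float]]:
    """Return (frame number, start, end) per item, shown back to back."""
    schedule = []
    clock = 0.0
    for item in items:
        # Each video is shown only for its own display time, then the next one starts.
        schedule.append((item.video_location, clock, clock + item.display_time))
        clock += item.display_time
    return schedule
```

For three items with display times of 0.5, 1.0 and 0.25 seconds, the
last video would end 1.75 seconds after the start of the special
reproduction.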

[0075] If the start time and the end time or the continuation
time are specified and this designation is followed, the
video present at the location specified with the video
location information 101 is consecutively reproduced from the
start time specified with the display time information 121 up
to the end time or during the continuation time, for the
number of items of the frame information "i" included in the
arrangement.
[0076] The described display time can be processed and
reproduced by using parameters such as reproduction rate
information and additional information.
[0077] Next, a method for describing the video location
information will be explained by using FIGS. 9 through 11.

[0078] FIG. 9 explains a method for describing the video
location information referring to the original video frame.

[0079] In FIG. 9, a time axis 200 corresponds to the original
video stream based on which the frame information for the
special reproduction is created, and a video 201 corresponds
to one frame which becomes a description target in the video
stream. A time axis 202 corresponds to the reproduction time
of a video at the time of the special reproduction using the
videos 201 extracted from the original video stream. A
display time 203 is the section of the time axis 202
corresponding to one video 201. For example, the video
location information 101 showing the location of the video
201 and the video display time 121 showing the length of the
display time 203 are described as frame information. As
described above, the description of the location of the video
201 may be given by a frame number, a time stamp or the like
as long as one frame in the original video stream can be
specified. This frame information will be described in the
same manner with respect to the other videos 201.

[0080] FIG. 10 explains a method for describing the video
location information referring to the image data file.

[0081] The method for describing the video location
information shown in FIG. 9 directly refers to the frame in
the original video which is to be subjected to the special
reproduction. The method for describing the video location
information shown in FIG. 10 is a method in which an image
data file 300 corresponding to a single frame 302 extracted
from the original video stream is created in a separate file,
and the location thereof is described. In the method for
describing the file location, the file can be handled in the
same manner by using, for example, the URL or the like, both
in the case where the file is present on a local storage
device and in the case where the file is present on the
network. A set of the video location information 101 showing
the location of this image data file and the video display
time 121 showing the length of the corresponding display time
301 is described as frame information.

[0082] If a correspondence to the original video frame is
required, information showing the single frame 302 of the
original video corresponding to the described frame
information (similar to the video location information in the
case of, for example, FIG. 9) may be included in the frame
information. The frame information may then comprise the
video location information, the display time information and
the original video information. When the original video
information is not required, it need not be described.

[0083] The configuration of the video data described with the
method of FIG. 10 is not particularly restricted. For
example, the frame of the original video may be used as it is
or may be reduced. This is effective for conducting
reproduction processing at a high speed because it is not
required to decompress the original video.

[0084] If the original video stream is compressed by means of
MPEG-1, MPEG-2 or the like, a reduced video can be created at
a high speed only by partially decoding the streams. In this
method, only the DCT (discrete cosine transform) coefficients
of an I picture frame, which is encoded within the frame (an
inner-frame encoded frame), are decoded and a reduced video
is created by using the DCT coefficients.

[0085] In the description method of FIG. 10, the image data
is stored in separate files. However, these files may be
stored together in a video data group storage file having a
video format (for example, Motion JPEG) which can be accessed
at random. In that case, the location of the video data is
specified by a combination of the URL showing the location of
the image data file and a frame number or a time stamp
showing the location in the image data file. The URL
information showing the location of the image data file may
be described in each frame information or may be described as
additional information outside of the arrangement of the
frame information.
[0086] Various methods can be taken to select the frame of
the original video or the like and create the video data to
be described in the video location information. For example,
the video data may be extracted at an equal interval from the
original video. Alternatively, where motion on the screen
appears frequently, the video data is selected at a narrow
interval, and where motion on the screen appears rarely, the
video frame is selected at a wide interval.

[0087] Here, referring to FIG. 11, there will be explained,
as one example of a method for selecting frames, a method in
which the frame is selected at a narrow interval where motion
on the screen appears frequently while the frame is selected
at a wide interval where motion on the screen appears rarely.

[0088] In FIG. 11, a horizontal axis represents the selected
frame number, and a curve 800 represents a change in the
scene change quantity (between adjacent frames). The method
for calculating the scene change quantity is the same as the
method used at the time of calculating the display time
described later. Here, in order to determine an extraction
interval in accordance with the motion of the scene, there is
shown a method for calculating an interval at which the scene
change quantity between video frames from which the video
data is extracted becomes constant. The total of the scene
change quantity between video frames from which the video
data is extracted is set to $S_i$, the total of the scene
change quantity over all frames is set to $S$ ($= \sum_i S_i$),
and the number of data items to be extracted is $n$. In order
to set the scene change quantity between video frames from
which video data is extracted to a constant level,
$S_i = S/n$ may be provided. In FIG. 11, the area $S_i$ of
the scene change quantity curve 800 divided with the broken
lines becomes constant. Then, for example, the scene change
quantity is accumulated from the extracted frame, and the
video frame at which the accumulated value exceeds $S/n$ is
set as the frame $F_i$ from which the video data is
extracted.
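
The accumulation described for FIG. 11 can be sketched as follows.
This is only an illustration under stated assumptions: the function
name select_frames is hypothetical, the first frame is assumed to be
always extracted, the accumulated value is cleared to 0 after each
extraction, and "reaching" $S/n$ (rather than strictly exceeding it)
is treated as sufficient so that exact multiples are not missed.

```python
from typing import List


def select_frames(scene_change: List[float], n: int) -> List[int]:
    """Pick frame numbers so the scene change between picks is about S/n.

    scene_change[k] is the scene change quantity between frame k and
    frame k + 1. Frame 0 is always extracted (an assumption of this sketch).
    """
    total = sum(scene_change)          # S, the total over all frames
    if n <= 0 or total <= 0:
        return [0]
    threshold = total / n              # target S_i = S / n
    selected = [0]
    accumulated = 0.0
    for k, quantity in enumerate(scene_change):
        accumulated += quantity
        # Extract the frame at which the accumulated value reaches S/n.
        if accumulated >= threshold and len(selected) < n:
            selected.append(k + 1)
            accumulated = 0.0          # cleared to 0 in this sketch
    return selected
```

With this rule, a run of frames with large scene change quantities
yields closely spaced extracted frames, while a quiet run yields
widely spaced ones.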

[0089] If the video data is created from an I picture frame
of MPEG, the video frame calculated as above is not
necessarily an I picture; in that case, the video data is
created from the I picture frame in the vicinity thereof.

[0090] By the way, in the method explained in FIG. 11, the
video frames which belong to a section in which the scene
change quantity = 0 are skipped. However, if a still picture
continues, the scene is important in many cases. Then, if the
scene change quantity = 0 continues for more than a constant
time, the frame at that time may be extracted. For example,
the scene change quantity may be accumulated from the
extracted frame so that the frame at which the accumulated
value exceeds $S/n$, or the frame at which the scene change
quantity = 0 has continued for more than a constant time, may
be set as a frame $F_i$ from which the video data is
extracted. The accumulated value of the scene change quantity
may be or may not be cleared to 0. It is possible to
selectively clear the accumulated value based on a request
from the user.
[0091] In the case of the example of FIG. 11, it is assumed
that the display time information 121 is described so that
the display time becomes the same with respect to all of the
frames. When the video is reproduced in accordance with this
display time information 121, the scene change quantity
becomes constant. The display time information 121 may also
be determined and described by a separate method.

[0092] Next, there will be explained a case in which 
one or a plurality of frames are allowed to correspond 
to one frame information. 

[0093] One example of the data structure of the special
reproduction control information is the same as that in
FIG. 8.

[0094] Hereinafter, a method for describing the video
location information will be explained by using FIGS. 12
through 14.

[0095] FIG. 12 explains a method for describing the video
location information for referring to the continuous frames
of the original video.

[0096] The method for describing the video location
information shown in FIG. 9 refers to one frame 201 in the
original video for conducting the special reproduction.
However, the method for describing the video location
information shown in FIG. 12 describes a set 500 of a
plurality of continuous frames in the original video. The set
500 of frames may include some frames extracted from the
plural continuous frames within the original video. The set
500 of frames may include only one frame.

[0097] If the set 500 of frames includes a plurality of
continuous frames or one frame in the original video, the
location of the start frame and the location of the end frame
are described, or the location of the start frame and the
continuation time of the set 500 are described, in the
description of the frame location (if one frame is included,
for example, the start frame is set equal to the end frame).
In the description of the location and the time, the frame
number, the time stamp and the like which can specify frames
in the streams are used.

[0098] If the set 500 of frames is a part of a plurality of
continuous frames in the original video, information is
described which enables the specification of the frames. If
the method for extracting the frames is determined, and the
frames can be specified with the description of the locations
of the start frame and the end frame, the start frame or the
end frame may be described.

[0099] The display time information 501 shows the total
display time corresponding to the whole frame group included
in the corresponding frame set 500. The display time of each
frame included in the set 500 of frames can be appropriately
determined on the side of the device for the special
reproduction. As a simple method, there is available a method
in which the above total display time is equally divided by
the total number of frames in the set 500 to provide the
display time of one frame. Various other methods are
available.
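
The simple equal division mentioned above can be written as a short
sketch. The function name per_frame_display_time is hypothetical, and
the frame set is assumed to be given by inclusive start and end frame
numbers, which is one of the description forms the text allows.

```python
def per_frame_display_time(total_display_time: float,
                           start_frame: int,
                           end_frame: int) -> float:
    """Divide the total display time of a frame set equally among its frames."""
    frame_count = end_frame - start_frame + 1   # start and end are inclusive
    if frame_count <= 0:
        raise ValueError("end_frame must not precede start_frame")
    return total_display_time / frame_count
```

A set that contains only one frame (start frame equal to end frame)
simply receives the whole display time.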

[0100] FIG. 13 explains a method for describing video
location information for referring to a set of the image
data files.

[0101] The method for describing the video location
information shown in FIG. 12 directly refers to continuous
frames in the original video to be reproduced. The method for
describing the video location information shown in FIG. 13
creates a set 600 of the image data files corresponding to
the original video frame set 602 extracted from the original
video stream in a separate file and describes the location
thereof. In the method for describing the file location, the
file can be handled in the same manner by using, for example,
the URL or the like, whether the file is present on a local
storage device or on a network. A set of the video location
information 101 showing the location of this image data file
and the video display time 121 showing the length of the
corresponding display time 601 can be described as the frame
information.
[0102] If a correspondence with the original frame is
required, information showing the frame set 602 of the
original video corresponding to the described frame
information (for example, information similar to the video
location information in the case of FIG. 12) may be included
in the frame information. The frame information may then
comprise the video location information, the display time
information and the original video information. The original
video information need not be described when it is not
required.
[0103] The configuration of the video data, the preparation
of the video data, the preparation of the reduced video, the
method for storing the video data and the method for
describing the location information such as the URL or the
like are the same as what has been described above.

[0104] Various methods can be adopted in the same manner as
described above as to which frame of the original video is
selected to create the video data to be described in the
video location information. For example, the video data may
be extracted at an equal interval from the original video.
Alternatively, where motion on the screen appears frequently,
a frame is extracted at a narrow interval, and where motion
on the screen appears rarely, a frame is extracted at a wide
interval.
[0105] In the above embodiments, the image data file 300 is
corresponded to the original video 302 in a frame to frame
manner. It is also possible to make the location information
of the frame described as the original video information have
a time width.
[0106] FIG. 14 shows an example in which the original video
information is allowed to have a time width with respect to
FIG. 8. Original video information 3701 is added to the frame
information structure shown in FIG. 8. The original video
information 3701 comprises start point information 3702 and
section length information 3703, which are the start point
and the section length of the section of the original video
which is a target of the special reproduction. The original
video information 3701 may comprise any information which can
specify the section of the original video having the time
width. It may comprise the start point information and end
point information instead of the start point information and
the section length information.

[0107] FIG. 15 shows an example in which the original video
information is allowed to have a time width with respect to
FIG. 9. In this case, for example, as the video location
information, the display time information and the original
video information included in the same frame information, the
location of the original video frame 3801, the display time
3802, and the original video frame section 3803 which
comprises the start point (frame location) and the section
length are described to show that these correspond to each
other. That is, as a video representative of the original
video frame section 3803, the original video frame at the
location 3801 described in the video location information is
displayed.
[0108] FIG. 16 shows an example in which the original video
information is allowed to have a time width with respect to
FIG. 10. In this case, for example, as the video location
information, the display time information and the original
video information included in the same frame information, the
location of the image data file 3901 for the display, the
display time 3902, and the original video frame section 3903
which comprises the start point (frame location) and the
section length are described to show that these correspond to
each other.

[0109] That is, as a video representative of the original
video frame section 3903, the image 3901 in the image data
file described in the video location information is
displayed.

[0110] Furthermore, as shown in FIGS. 12 and 13, if a set of
frames is used as a video for the display, a section
different from the original video frame section for
displaying the video may be allowed to correspond to the
original video information.

[0111] FIG. 17 shows an example in which the original video
information is allowed to have a time width with respect to
FIG. 12. In this case, for example, as the video location
information, the display time information and the original
video information included in the same frame information, a
set 4001 of frames in the original video, the display time
4002, and the original video frame section 4003 which
comprises the start point (frame location) and the section
length are described to show that these correspond to each
other.

[0112] At this time, the section 4001 of the set of frames
described as the video location information and the original
video frame section 4003 described as the original video
information are not necessarily required to coincide with
each other, and a different section may be used for display.
[0113] FIG. 18 shows an example in which the original video
information is allowed to have a time width with respect to
FIG. 13. In this case, for example, as the video location
information, the display time information and the original
video information included in the same frame information, a
set 4101 of frames in the video file, the display time 4102,
and the original video frame section 4103 which comprises the
start point (frame location) and the section length are
described to show that these correspond to each other.

[0114] At this time, the section of the set 4101 of frames
described as the video location information and the original
video frame section 4103 described as the original video
information are not necessarily required to coincide with
each other. That is, the section of the set 4101 of frames
for the display may be shorter or longer than the original
video frame section 4103. Furthermore, a video having
completely different contents may be included therein. In
addition, only [...] extracted from the section described in
the original video location as the image data file so that
collected video data is used.

[0115] At the time of displaying the videos based on, for
example, the summarized reproduction (special reproduction)
using those items of the frame information, it may be desired
that the corresponding frame in the original video is
referred to.

[0116] FIG. 19 shows a flow for starting the reproduction
from the frame of the original video corresponding to the
video frame displayed in special reproduction. At step S3601,
the reproduction start frame is specified in the special
reproduction. At step S3602, the original video frame
corresponding to the specified frame is calculated with a
method described later. At step S3603, the original video is
reproduced from the calculated frame.
[0117] This flow can be used for referring to the
corresponding location of the original video in addition to
special reproduction.

[0118] At step S3602, as one example of a method for
calculating the corresponding original video frame, there is
shown a method using proportional distribution with respect
to the display time of the specified frame. The display time
information included in the i-th frame information is set to
$D_i$ sec, the section start location of the original video
information is set to $t_i$ sec, and the section length is
set to $d_i$ sec. If the location is specified at which $t$
sec has passed from the start of the reproduction using the
i-th frame information, the frame location of the
corresponding original video is $T = t_i + d_i \times t / D_i$.
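
The proportional distribution of step S3602 can be sketched as
follows. The function name original_location is hypothetical, and
clamping the elapsed time into the range from 0 to $D_i$ is an
assumption of this sketch rather than something the text specifies.

```python
def original_location(t: float, D_i: float, t_i: float, d_i: float) -> float:
    """Map elapsed display time t of the i-th item to the original video.

    D_i: display time of the i-th frame information (sec)
    t_i: section start location in the original video (sec)
    d_i: section length in the original video (sec)
    """
    if D_i <= 0:
        raise ValueError("display time D_i must be positive")
    t = min(max(t, 0.0), D_i)        # clamp into the item (sketch assumption)
    return t_i + d_i * t / D_i       # T = t_i + d_i * t / D_i
```

For example, if an item displayed for 2 sec represents a 30 sec
section starting at 60 sec, a selection made 1 sec into the item maps
to the middle of that section.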

[0119] Referring to FIGS. 20 and 21, as examples of a method
for selecting a frame, there will be explained a method for
extracting frames, in accordance with the motion of the
screen, at a narrow interval where motion on the screen
appears frequently and at a wide interval where motion on the
screen appears rarely. The horizontal axis, the curve 800,
and $S_i$ and $F_i$ are the same as those in FIG. 11.
[0120] In the example of FIG. 11, the video data is extracted
one frame after another at an interval at which the scene
change quantity between the frames from which the video data
is extracted is made constant. FIGS. 20 and 21 show examples
in which a set of a plurality of frames is extracted with the
frame $F_i$ as a reference. For example, as shown in FIG. 20,
the same number of continuous frames may be extracted from
each $F_i$. The frame length 811 and the frame length 812 are
equal to each other. As shown in FIG. 21, a corresponding
number of continuous frames may be extracted so that the
total of the scene change quantity from $F_i$ becomes
constant. The area 813 and the area 814 are equal to each
other. Various other methods can be considered.
[0121] It is also possible to use the frame selection method
in which the frame is extracted when the scene change
quantity = 0 continues for more than a constant time.
[0122] As in the case of FIG. 11, the display time
information 121 may be described so that the same display
time is provided with respect to any of the frame sets in the
cases of FIGS. 20 and 21. Alternatively, the display time
information may be determined and described by a different
method.
[0123] Next, one example of a processing for calculating the
display time will be explained.
[0124] FIG. 22 shows one example of a procedure of the basic
processing for calculating the display time so that the scene
change quantity becomes constant as much as possible when the
video described in the video location information is
continuously reproduced in accordance with the time described
in the display time information.
[0125] This processing can be applied to a case in which the
frames are extracted by any method. However, if the frames
are extracted by the method shown in FIG. 11, for example,
the processing can be omitted, since the processing shown in
FIG. 11 selects the frames such that the scene change
quantity becomes constant when the frames are displayed for a
fixed time period.

[0126] At step S71, the scene change quantity between
adjacent frames is calculated with respect to all frames of
the original video. If each frame of the video is represented
as a bit map, the differential value of the pixels between
adjacent frames can be set as the scene change quantity. If
the video is compressed with MPEG, the scene change quantity
can be calculated by using a motion vector.
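
For the bit map case, one possible reading of "the differential value
of the pixels" is the mean absolute difference between corresponding
pixels. The sketch below assumes grayscale frames given as nested
lists of equal size; the names Frame, scene_change_quantity and
scene_change_series are hypothetical, and the choice of the mean
absolute difference is an assumption, since the text does not fix the
exact measure.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]   # grayscale bit map: rows of pixel values


def scene_change_quantity(prev: Frame, curr: Frame) -> float:
    """Mean absolute pixel difference between two equal-sized frames."""
    total = 0
    count = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += abs(a - b)
            count += 1
    return total / count if count else 0.0


def scene_change_series(frames: List[Frame]) -> List[float]:
    """Scene change quantity between each pair of adjacent frames."""
    return [scene_change_quantity(frames[k], frames[k + 1])
            for k in range(len(frames) - 1)]
```

A sequence of identical frames gives a series of zeros, which
corresponds to the still picture sections discussed earlier.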

[0127] One example of a method for calculating the scene
change quantity will be explained.

[0128] FIG. 23 shows one example of a basic 
processing procedure for calculating a scene change 
quantity of all frames from the video streams com- 
pressed with MPEG. 
[0129] At step S81, a motion vector is extracted from the P
picture frame. The video frame compressed with MPEG is
described with an arrangement of I pictures (inner-frame
encoded frames), P pictures (inter-frame encoded frames in a
forward prediction), and B pictures (inter-frame encoded
frames in a bidirectional prediction), as shown in FIG. 24.
The P picture includes a motion vector corresponding to a
motion from the preceding I picture or P picture.

[0130] At step S82, the magnitude (intensity) of each motion
vector included in the frame of one P picture is calculated,
and the average thereof is set as the scene change quantity
from the preceding I picture or P picture.
[0131] At step S83, on the basis of the scene change quantity
calculated with respect to the P picture, the scene change
quantity is calculated for each single frame, including the
frames other than the P picture. For example, if the average
value of the motion vectors of the P picture frame is $p$,
and the interval from the preceding I picture or P picture to
which the P picture refers is $d$, the scene change quantity
per frame of each frame is set to $p/d$.
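
Steps S82 and S83 can be sketched as follows, assuming the motion
vectors of a P picture have already been extracted from the stream as
(dx, dy) pairs. Extracting them from an actual MPEG stream is outside
this sketch, and the function names p_picture_activity and
per_frame_scene_change are hypothetical.

```python
import math
from typing import List, Tuple


def p_picture_activity(motion_vectors: List[Tuple[float, float]]) -> float:
    """Step S82: average magnitude of the motion vectors of one P picture."""
    if not motion_vectors:
        return 0.0
    return sum(math.hypot(dx, dy) for dx, dy in motion_vectors) / len(motion_vectors)


def per_frame_scene_change(p: float, d: int) -> List[float]:
    """Step S83: spread p evenly over the d frames since the reference picture."""
    if d <= 0:
        raise ValueError("interval d must be positive")
    return [p / d] * d
```

Concatenating the lists returned for successive P pictures gives a
per-frame scene change series covering the frames between reference
pictures.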

[0132] Subsequently, at step S72 in the procedure of FIG. 22,
the total of the scene change quantity of the frames from
each description target frame described in the video location
information up to the following description target frame is
calculated.
[0133] FIG. 25 describes a change in the scene change
quantity for each frame. The horizontal axis corresponds to
the frame number while a curve 1000 denotes a change in the
scene change quantity. If the display time of the video
having the location information of the frame information
$F_i$ is calculated, the scene change quantity in the section
1001 up to $F_{i+1}$, which corresponds to the frame location
of the next description target frame, is added. It is
considered that this becomes the area $S_i$ of the hatched
portion 1002, which is the magnitude of the motion at the
frame location $F_i$.
[0134] Subsequently, at step S73 in the procedure of
FIG. 22, the display time of each frame is calculated. In
order to keep the scene change quantity as constant as
possible during reproduction, a longer display time need
only be allocated to the frames where the motion of the
screen is large, so that the ratio of the display time
allocated to the video of each frame location F_i to the
reproduction time may be set to S_i / Σ_j S_j. When the
total of the reproduction time is set to T, the display time
of each video is D_i = T × S_i / Σ_j S_j. The value of the
total T of the reproduction time is defined as the total
reproduction time of the original video.
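
The allocation D_i = T × S_i / Σ_j S_j can be sketched as follows. This is a minimal example; the function name is hypothetical, and returning zeros when every S_i is 0 is an assumption made only to avoid a division by zero.

```python
def allocate_display_times(scene_changes, total_time):
    """Split total_time T among frames in proportion to S_i.

    scene_changes: list of accumulated scene change quantities S_i.
    total_time: total reproduction time T.
    """
    total = sum(scene_changes)
    if total == 0:
        # Assumption for this sketch: no motion at all, nothing to allocate
        return [0.0] * len(scene_changes)
    return [total_time * s / total for s in scene_changes]
```

For example, with S = [1, 3] and T = 8 seconds, the second frame location receives three times the display time of the first.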
[0135] If no scene change appears and S_i = 0, a lower
limit value (for example, 1) which is determined in advance
may be entered, or the frame information thereof may not be
described. Even if S_i = 0 does not hold, for a frame where
the screen change is very small and virtually no change is
displayed in actual reproduction, the lower limit value may
be substituted or no frame information may be described. If
no frame information is described, the value of S_i may be
added to S_{i+1} or may not be added thereto.
[0136] The processing for calculating this display time
can be conducted for the preparation of the frame
information with the special reproduction control
information creating apparatus, but may also be conducted at
the time of the special reproduction on the side of the
video reproduction apparatus.
[0137] Next, there will be explained a case in which 
the special reproduction is conducted. 
[0138] FIG. 26 shows one example for the N times 
high-speed reproduction on the basis of the special re- 
production control information that has been described. 
[0139] At step S111, the display time D'_i at the time of
reproduction is calculated on the basis of the reproduction
rate information. Since the display time information
described in the frame information is the standard display
time, the display time D'_i = D_i / N of each frame is
calculated when reproduction at N times high speed is
conducted.
[0140] At step S112, initialization for the display is
conducted, and i = 0 is set so that the first frame
information is displayed.
[0141] At step S113, it is determined whether the display
time D'_i of the i-th frame information is larger than the
preset threshold value of the display time.
[0142] If the display time is larger, the video indicated by
the video location information included in the i-th frame
information F_i is displayed for D'_i seconds at step S114.



[0143] If the display time is not larger, the process
proceeds to step S115 to search in the forward direction for
the i-th frame information whose display time is not smaller
than the threshold value. During the search, the display
times of the frame information smaller than the threshold
value are all added to the display time of that i-th frame
information, and the display times of the frame information
smaller than the threshold value are set to 0. The reason
for this processing is that, when the display time at the
time of reproduction becomes very short, the time for
preparing the video to be displayed becomes longer than the
display time, with the result that the display cannot be
conducted in time. Therefore, if the display time is very
short, the process proceeds to the next step without
displaying the video. At that time, the display time of the
video which is not displayed is added to the display time of
the video to be displayed, so that the total display time
remains unchanged.
[0144] At step S116, it is determined whether i is smaller
than the total number of the frame information items in
order to determine whether or not frame information which
has not been displayed remains. If i is smaller than the
total number of the frame information items, the process
proceeds to step S117 to increment i by one to prepare for
the display of the next frame information. When i reaches
the total number of the frame information items, the
reproduction processing is completed.
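
The procedure of FIG. 26 (steps S111 to S117) can be sketched as a function that returns which frame information items are displayed and for how long. This is an illustrative sketch only: the function name is hypothetical, display times at or above the threshold are treated as displayable, and folding any leftover short display times at the end into the last displayed item is an assumption not stated in the text.

```python
def schedule_fast_playback(display_times, n, threshold):
    """Return a list of (index, seconds) for N-times reproduction.

    display_times: standard display times D_i from the frame information.
    n: reproduction rate N.
    threshold: minimum display time that can be shown in time.
    """
    scaled = [d / n for d in display_times]   # D'_i = D_i / N (step S111)
    schedule = []
    carry = 0.0                               # time taken from skipped items
    for i, d in enumerate(scaled):
        if d >= threshold:
            # Display this item, extended by the skipped time (S114/S115)
            schedule.append((i, d + carry))
            carry = 0.0
        else:
            # Too short to display: its time is handed to the next shown item
            carry += d
    if carry > 0 and schedule:
        # Assumption: trailing skipped time goes to the last displayed item
        idx, dur = schedule[-1]
        schedule[-1] = (idx, dur + carry)
    return schedule
```

Because skipped display times are added to a displayed item, the total display time is preserved whenever at least one item is displayed.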
[0145] FIG. 27 shows one example for conducting the N times
high-speed reproduction on the basis of the described
special reproduction control information by taking the
display cycle as a reference.
[0146] At step S121, the display time D'_i of each frame is
calculated as D'_i = D_i / N for the N times high-speed
reproduction. Here, the calculated display time must
actually be associated with the display cycle, so that the
video cannot always be displayed for the calculated time.

[0147] FIG. 28 shows the relationship between the calculated
display time and the display cycle. The time axis 1300 shows
the calculated display time, while the time axis 1301 shows
the display cycle based on the display rate. If the display
rate is f frames/sec, the interval of the display cycle is
1/f sec.
[0148] Consequently, at step S122, the frame information F_i
whose display time includes the start point of the display
cycle is searched for, and the video included in the frame
information F_i is displayed for one display cycle (1/f sec)
at step S123.

[0149] For example, the display cycle 1302 (FIG. 28)
displays the video of the frame information corresponding to
the calculated display time 1304, because the display start
point 1303 is included in the calculated display time 1304.
[0150] As another method for associating the display cycle
with the frame information, the video at the location
nearest to the start point of the display cycle may be
displayed, as shown in FIG. 29. If the display time becomes
smaller than the display cycle, like the display time 1305
of FIG. 28, the display of the video may be omitted. If the
video is forcibly displayed, the display times before and
after the video are shortened so that the total display time
remains unchanged.

[0151] At step S124, it is determined whether the current
display is the final display or not. If the current display
is the final display, the processing is completed. If it is
not the final display, the process proceeds to step S125 to
conduct the processing of the next display cycle.
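
The display-cycle-based procedure of FIG. 27 can be sketched by asking, for each cycle start k/f, which calculated display interval contains it. The function name is hypothetical, and treating each interval as half-open (start included, end excluded) is an assumption of this sketch.

```python
from bisect import bisect_right
from itertools import accumulate

def frames_per_cycle(display_times, n, rate):
    """Return the frame information index shown in each display cycle.

    display_times: standard display times D_i.
    n: reproduction rate N.
    rate: display rate f in frames per second.
    """
    scaled = [d / n for d in display_times]   # D'_i = D_i / N (step S121)
    ends = list(accumulate(scaled))           # end time of each interval
    total = ends[-1] if ends else 0.0
    shown = []
    k = 0
    while k / rate < total:
        # Index of the interval that contains the cycle start (step S122)
        shown.append(bisect_right(ends, k / rate))
        k += 1                                # next display cycle (step S125)
    return shown
```

With display times [1.1, 0.2, 1.0], N = 1 and f = 2, the middle item is shorter than one display cycle and contains no cycle start, so it is never shown.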

[0152] FIG. 30 shows another example of a data structure
for describing the frame information. The frame information
included in the data structure of FIG. 8 or FIG. 14
summarizes a single original video. A plurality of original
videos can be summarized by expanding the frame information.
FIG. 30 shows such an example. Original video location
information 4202 for indicating the original video file
location is added to the original video information 4201
included in the individual frame information. The file
described in the original video location information 4202
need not be handled in its entirety. The file can be used in
a form in which only a portion of its sections is extracted.
In this case, not only file information such as a file name
but also section information showing which section of the
file is targeted is additionally described. Plural sections
may be selected from the original video.
[0153] Furthermore, if several kinds of original videos are
present and identification information is individually
attached to the videos, the original video identification
information may be described in place of the original video
location information.

[0154] FIG. 31 explains an example in which a plurality of
original videos are summarized and displayed by using the
frame information to which the original video location
information is added. In this example, three videos are
summarized to display one summarized video. With respect to
the video 2, in place of the whole section, two sections
4301 and 4302 are taken out and handled as respective
videos. As the frame information, together with these items
of original video information, the frame location (4303 with
respect to 4301) of the respective representative video is
described as the video location information, while the
display time (4304 with respect to 4301) is described as the
display time information.
[0155] FIG. 32 explains another example in which a
plurality of original videos are summarized and displayed by
using the frame information to which the original video
location information is added. In this example, three videos
are summarized to display one summarized video. With respect
to the video 2, in place of the whole section, a portion of
the section is taken out. A plurality of sections may be
taken out as described in FIG. 31. As the frame information,
together with these items of the original video information
(for example, the section information 4401 in addition to
the video 2), the storage location of the respective
representative video files 4402 is described as the video
location information and the display time 4403 is described
as the display time information.
[0156] The addition of the original video location
information to the frame information explained in these
examples can be applied in exactly the same way to the case
in which a set of frames is used as the video location
information, with the result that a plurality of original
videos are summarized and displayed.
[0157] FIG. 33 shows another data structure for describing
the frame information. In this data structure, in addition
to the video location information 101, the display time
information 121 and the original video information 3701
which have already been explained, motion information 4501
and interest region information 4502 are added. The motion
information 4501 describes the magnitude of the motion (the
scene change quantity) in the section (the section described
in the original video information) of the original video
corresponding to the frame information. The interest region
information 4502 describes a region of particular interest
in the video which is described in the video location
information.
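
One possible in-memory form of the frame information of FIG. 33 is sketched below. The class name, field names, field types and the (x, y, width, height) region encoding are all assumptions made for illustration; the text does not prescribe a concrete encoding.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class FrameInformation:
    # Video location information 101 (here: a frame number, by assumption)
    video_location: int
    # Display time information 121, in seconds; may be omitted
    display_time: Optional[float] = None
    # Original video information 3701 (here: a file name, by assumption)
    original_video: Optional[str] = None
    # Motion information 4501: scene change quantity of the section
    motion: Optional[float] = None
    # Interest region information 4502 as (x, y, width, height), by assumption
    interest_region: Optional[Tuple[int, int, int, int]] = None
```

Making every field except the video location optional reflects that the display time may be omitted when motion information alone is described.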
[0158] The motion information can be used for calculating
the display time of the video described in the video
location information, in the same way as the display time is
calculated from the motion of the video as shown in FIG. 22.
In this case, even when the display time information is
omitted and only the motion information is described,
special reproduction such as high-speed reproduction can be
conducted in the same manner as in the case in which the
display time is described. In this case, the display time is
calculated at the time of reproduction.
[0159] Both the display time information and the motion
information can be described at the same time. In that case,
an application for displaying uses the required one of the
two, or uses both in combination, in accordance with the
processing.

[0160] For example, a display time calculated irrespective
of the motion is described in the display time information.
A method for calculating the display time for cutting out
important scenes from the original video corresponds to
this. At the time of the high-speed reproduction of the
summarized contents calculated in this manner, the motion
information is used so that a portion with a large motion is
reproduced slowly while a portion with a small motion is
reproduced quickly, with the result that high-speed
reproduction with few overlooked portions is enabled.

[0161] The interest region information is used when a
region of particular interest is present in the video
described in the video location information of the frame
information. For example, faces of persons who seem to be
important correspond to this. At the time of displaying the
video including such interest region information, the
display may be conducted by overlaying a square frame so
that the interest region can be easily detected. The frame
display is not indispensable, and the video may simply be
displayed as it is.
[0162] The interest region information can also be used for
processing and displaying the special reproduction control
information such as the frame information. For example, if
only a part of the frame information is reproduced and
displayed, the frame information including the interest
region information is displayed with priority. Further,
frame information including a square region with a large
area may be assumed to have higher importance, thereby
making it possible to selectively display the video.

[0163] Above, an example has been explained in which the
processing is conducted on the basis of the scene change
quantity. Hereinafter, a case in which the importance
information is used will be explained.

[0164] FIG. 34 is a view showing examples of a data
structure of the frame information attached to the video.
[0165] Importance information 122 is described in addition
to or in place of the display time control information 102
in the data structure of the frame information of FIG. 1.
The display time is calculated based on the importance
information 122.

[0166] The importance information 122 represents the
importance of the corresponding frame (or set of frames).
The importance is represented, for example, as an integer in
a fixed range (for example, 0 to 100), or as a real number
in a fixed range (for example, 0 to 1). Alternatively, the
importance information 122 may be represented as an integer
or a real number value without an upper limit. The
importance information 122 may be attached to all the frames
of the video, or only to the frames at which the importance
changes.

[0167] In this case as well, it is possible to take any
form of FIGS. 9, 10, 12, and 13. The frame extraction
method of FIGS. 11, 20, and 21 can be used. In this case,
the scene change quantity of FIGS. 11, 20, and 21 may be
replaced by the importance.
[0168] In the example explained above, the display time is
set with the scene change quantity. However, the display
time may be set with the importance information.
Hereinafter, the method for setting the display time will be
explained.
[0169] In setting the display time on the basis of the scene
change quantity exemplified above, in order to make the
video contents easy to understand, the display time is set
long where the change quantity is large and short where the
change quantity is small. In setting the display time on the
basis of the importance, the display time is set long where
the importance is high and short where the importance is
low. Since the method for setting the display time based on
the importance is basically similar to the method for
setting the display time based on the scene change quantity,
the method will be explained only briefly.

[0170] FIG. 36 shows one example of the basic processing
procedure in this case.
[0171] At step S191, the importance of all frames of the
original video is calculated. A concrete method thereof will
be exemplified later.
[0172] At step S192, the total of the importance from the
description object frame described in the video location
information to the next description object frame is
calculated.

[0173] FIG. 37 shows the change in the importance for each
frame. Reference numeral 2200 denotes the importance. When
the display time of the video having the location
information of the frame information F_i is calculated, the
importance is accumulated over the section up to F_{i+1},
which is the next description object frame location. The
accumulation result is the area S'_i of the hatched portion
2202.

[0174] At step S193, the display time of each frame is
calculated. Suppose that the ratio of the display time
allocated to the video at each frame location F_i to the
reproduction time is set to S'_i / Σ_j S'_j. When the total
of the reproduction time is set to T, the display time of
each video becomes D_i = T × S'_i / Σ_j S'_j. The value of
the total T of the reproduction time is a standard
reproduction time, defined as the total reproduction time of
the original video.

[0175] When the total of the importance becomes S'_i = 0, a
preset lower limit value (for example, 1) may be described,
or the frame information may not be described. Even if
S'_i = 0 does not hold, when the importance is very small
and such a frame would virtually not be displayed, the lower
limit value may be described or the frame information may
not be described. If the frame information is not described,
the value of S'_i may or may not be added to S'_{i+1}.
[0176] As shown in FIG. 34, in the data structure of the
frame information of FIG. 1, the video location information
101, the display time information 121 and the importance
information 122 may be described in each frame information
"i". At the time of the special reproduction, there are four
cases: the display time information 121 is used but the
importance information 122 is not used; the importance
information 122 is used but the display time information 121
is not used; both the importance information 122 and the
display time information 121 are used; and neither the
importance information 122 nor the display time information
121 is used.

[0177] The processing of calculating the display time can
be conducted for preparing the frame information with the
special reproduction control information creating apparatus.
However, the processing may instead be conducted on the side
of the video reproduction apparatus at the time of the
special reproduction.

[0178] Next, a method (for example, step S191 of FIG. 36)
for calculating the importance of each frame or scene (video
frame section) will be explained.

[0179] Since various factors are normally intertwined in
judging whether a certain scene of a video is important, the
most appropriate method for calculating the importance is
one in which a person determines the importance. In this
method, an importance evaluator evaluates the importance for
each scene of the video, or for each constant interval, and
the result is input as importance data. The importance data
referred to here is a correspondence table between frame
numbers or times and importance values. In order to avoid a
subjective evaluation of importance, a plurality of
importance evaluators may evaluate the same video, and the
average value (or a median or the like) is calculated for
each scene or each video frame section so that the
importance is finally determined. With such manual input of
the importance data, it is possible to reflect in the
importance vague impressions and a plurality of elements
which cannot be expressed in words.

[0180] In order to save the trouble of determination by a
person, it is preferable to anticipate phenomena that are
likely to accompany a video scene which seems to be
important, and to use processing which automatically
evaluates such phenomena and converts them into importance.
Here, some examples are shown in which the importance is
automatically created.
[0181] FIG. 38 shows an example of a processing procedure
for automatically calculating importance data on the basis
of the idea that a scene having a large sound level is
important. FIG. 38 also serves as a function block diagram.

[0182] In the sound level calculation processing at step
S210, when the sound data attached to the video is input,
the sound level at each time is calculated. Since the sound
level changes largely from instant to instant, smoothing
processing or the like may be conducted in the sound level
calculation processing at step S210.
[0183] In the importance calculation processing at step
S211, processing is conducted for converting the sound level
output as a result of the sound level calculation processing
into importance. For example, the input sound level is
linearly converted into a value of 0 to 100, with a preset
lowest sound level being mapped to 0 and a preset highest
sound level being mapped to 100. A sound level not more than
the lowest sound level is set to 0, while a sound level not
less than the highest sound level is set to 100. As a result
of the importance calculation processing, the importance at
each time is calculated and output as importance data.
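
The linear conversion of step S211 can be sketched as a clamped mapping onto the range 0 to 100. The function and parameter names are hypothetical, and the unit of the sound level is left open because the text does not specify it.

```python
def sound_level_to_importance(level, lowest, highest):
    """Map a sound level linearly onto 0..100, clamping at both ends.

    lowest and highest are the preset sound levels mapped to 0 and 100.
    """
    if highest <= lowest:
        raise ValueError("highest must be greater than lowest")
    if level <= lowest:
        return 0.0
    if level >= highest:
        return 100.0
    return 100.0 * (level - lowest) / (highest - lowest)
```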

[0184] FIG. 39 shows an example of a processing procedure
of another method for automatically calculating the
importance level. FIG. 39 also serves as a function block
diagram.
[0185] In the processing of FIG. 39, it is determined that
a scene in which important words registered in advance
appear frequently in the sound attached to the video is
important.

[0186] In the sound recognition processing at step S220,
when the sound data attached to the video is input, the
language (words) spoken by a person is converted into text
data.
[0187] In the important word dictionary 221, words which
are likely to appear in important scenes are registered. If
the degree of importance differs among the registered words,
a weight is attached to each of the registered words.
[0188] In the word collation processing at step S222, the
text data which is the output of the sound recognition
processing is collated with the words registered in the
important word dictionary 221 to determine whether or not
important words are spoken.
[0189] In the importance calculation processing at step
S223, the importance in each scene of the video or at each
time is calculated from the result of the word collation
processing. In this calculation, the number of appearances
of important words and the weights of the important words
are used, so that processing is conducted to increase the
importance around the time at which, for example, important
words have appeared (or of the scene in which the important
words have appeared) by a constant value, or by a value
proportional to the weight of the important words. As a
result of the importance calculation processing, the
importance at each time is calculated and output as
importance data.
[0190] If the weights of all the words are set to the same
value, the important word dictionary 221 becomes
unnecessary. This is because it is then assumed that a scene
in which many words are spoken is important. In this case,
in the word collation processing at step S222, processing
for simply counting the number of words output from the
sound recognition processing is conducted. Not only the
number of words but also the number of characters may be
counted.
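
The word collation and importance calculation of steps S222 and S223 can be sketched per scene as follows. The function name, the list-of-word-lists input and the dictionary-of-weights representation of the important word dictionary are assumptions for illustration; speech recognition itself is outside the sketch.

```python
def word_importance(scene_words, weights=None):
    """Return one importance value per scene from recognized words.

    scene_words: list of word lists, one list per scene (assumed input form).
    weights: mapping from important word to weight, or None to simply
             count every word, as when all weights are equal.
    """
    scores = []
    for words in scene_words:
        if weights is None:
            # No dictionary: a scene with many words counts as important
            scores.append(float(len(words)))
        else:
            # Sum the weights of the registered words that appear
            scores.append(float(sum(weights.get(w, 0.0) for w in words)))
    return scores
```

The same function applies unchanged to text obtained from telop recognition, since only the source of the words differs.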

[0191] FIG. 40 shows an example of a processing procedure
of another method for automatically calculating the
importance level. FIG. 40 also serves as a function block
diagram.
[0192] The processing of FIG. 40 determines that a scene in
which many important words registered in advance appear in
the telop appearing in the video is important.
[0193] In the telop recognition processing at step S230,
the character location in the video is specified, and
characters are recognized by converting the video region at
the character location into binary values. The recognized
result is output as text data.
[0194] The important word dictionary 231 is the same as the
important word dictionary 221 of FIG. 39.
[0195] In the word collation processing at step S232, in
the same manner as at step S222 in the procedure of FIG. 39,
the text data which is the output of the telop recognition
processing is collated with the words registered in the
important word dictionary 231 to determine whether or not
important words have appeared.
[0196] In the importance calculation processing at step
S233, the importance at each scene or at each time is
calculated from the number of appearances of important words
and the weights of the important words, in the same manner
as at step S223 in the procedure of FIG. 39. As a result of
the importance calculation processing, the importance at
each time is determined and output as importance data.

[0197] If the weights of all the words are set to the same
value, the important word dictionary 231 becomes
unnecessary. This is because it is then assumed that a scene
in which many words appear is an important scene. In this
case, in the word collation processing at step S232,
processing is conducted for simply counting the number of
words output from the telop recognition processing. Not only
the number of words but also the number of characters may be
counted.
[0198] FIG. 41 shows an example of a processing procedure
of still another method for automatically calculating the
importance level. FIG. 41 also serves as a function block
diagram.

[0199] The processing of FIG. 41 determines that the larger
the character size of the telop appearing in the video, the
more important the scene.
[0200] In the telop detection processing at step S240,
processing is conducted for specifying the location of a
character string in the video.
[0201] In the character size calculation processing at step
S241, individual characters are extracted to calculate the
average value or the maximum value of the size (area) of the
characters.

[0202] In the importance calculation processing at step
S242, an importance proportional to the character size
output from the character size calculation processing is
calculated. If the calculated importance is too large or too
small, processing is conducted for restricting the
importance to a preset range by threshold processing. As a
result of the importance calculation processing, the
importance at each time is calculated and output as
importance data.
[0203] FIG. 42 shows an example of the processing procedure
of still another method for automatically calculating the
importance level. FIG. 42 also serves as a function block
diagram.

[0204] The processing of FIG. 42 determines that the 
scene in which human faces appear in the video is im- 
portant. 

[0205] In the face detection processing at step S250,
processing is conducted for detecting areas which look like
a human face in the video. As a result of the processing,
the number of areas (number of faces) which are determined
to be a human face is output. Information on the size (area)
of each face may be output at the same time.

[0206] In the importance calculation processing at step
S251, the number of faces which is the output of the face
detection processing is multiplied by a constant to
calculate the importance. If the output of the face
detection processing includes face size information, the
calculation is conducted so that the importance increases
with an increase in the size of the faces. For example, the
area of the face is multiplied by a constant to calculate
the importance. As a result of the importance calculation
processing, the importance at each time is calculated and
output as importance data.
[0207] FIG. 43 shows an example of the processing procedure
of still another method for automatically calculating the
importance level. FIG. 43 also serves as a function block
diagram.

[0208] In the processing of FIG. 43, it is determined that
a scene in which a video similar to a video registered in
advance appears is important.
[0209] The video which should be determined to be important
is registered in the important scene dictionary 260. The
video is recorded as raw data or in a compressed form.
Instead of the video itself, a characteristic quantity (a
color histogram, a frequency or the like) of the video may
be recorded.
[0210] In the similarity/non-similarity calculation
processing at step S261, the similarity/non-similarity
between the video registered in the important scene
dictionary 260 and the input video data is calculated. As
the non-similarity, the total of the squared errors or the
total of the absolute differences is used. If the video data
is recorded in the important scene dictionary 260, the total
of the squared errors or the total of the absolute
differences over the corresponding pixels is calculated as
the non-similarity. If the color histogram of the video is
recorded in the important scene dictionary 260, the same
color histogram is calculated with respect to the input
video data, and the total of the squared errors or the total
of the absolute differences between the histograms is set as
the non-similarity.

[0211] In the importance calculation processing at step
S262, the importance is calculated from the
similarity/non-similarity which is the output of the
similarity/non-similarity calculation processing. The
importance is calculated in such a manner that a larger
similarity provides a greater importance if the similarity
is input, while a larger non-similarity provides a smaller
importance if the non-similarity is input. As a result of
the importance calculation processing, the importance at
each time is calculated and output as the importance data.
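
The non-similarity measures of step S261 and one possible conversion to importance at step S262 can be sketched as follows. The function names are hypothetical. The text only requires that a larger non-similarity gives a smaller importance; the particular mapping scale / (1 + d) is an assumption chosen for illustration.

```python
def non_similarity(hist_a, hist_b):
    """Return (sum of squared errors, sum of absolute differences)
    between two feature vectors such as color histograms."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    sse = sum((a - b) ** 2 for a, b in zip(hist_a, hist_b))
    sad = sum(abs(a - b) for a, b in zip(hist_a, hist_b))
    return sse, sad

def importance_from_non_similarity(d, scale=100.0):
    """Monotonically decreasing mapping (an assumed choice):
    d = 0 gives the full scale, larger d gives smaller importance."""
    return scale / (1.0 + d)
```

The same functions apply to raw pixel data if both frames are flattened into equal-length sequences.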
[0212] Furthermore, as another method for automatically
calculating the importance, a scene having a high
instantaneous viewing rate is set as an important scene. The
data on the instantaneous viewing rate is obtained by
totaling the results of a viewing rate survey, and the
importance is calculated by multiplying the instantaneous
viewing rate by a constant. Needless to say, there are
various other methods.
[0213] The importance calculation processing may be
conducted with a single method, or a plurality of data items
may be used at the same time to calculate the importance. In
the latter case, for example, the importance of one video is
calculated with several different methods, and the final
importance is calculated as their average value or maximum
value.
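
Combining the outputs of several importance calculation methods can be sketched as an element-wise average or maximum. The function name and the "average"/"max" mode strings are assumptions for this example, which also assumes that every method produced one value per time position.

```python
from statistics import mean

def combine_importance(series, mode="average"):
    """Combine several importance series into one.

    series: list of equally long importance lists, one per method.
    mode: "average" or "max".
    """
    if mode == "average":
        return [mean(values) for values in zip(*series)]
    if mode == "max":
        return [max(values) for values in zip(*series)]
    raise ValueError("mode must be 'average' or 'max'")
```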
[0214] In the above embodiment, the explanation has been
given by citing the scene change quantity and the [...]
change quantity or the importance, or instead of the scene
change quantity or importance.
[0215] Next, there will be explained a case in which
information for the control of reproduction/non-reproduction
is added to the frame information (see FIG. 1).
[0216] It is desired that either only a specific scene or a
part thereof (for example, a highlight scene), or only a
scene or a part thereof in which a specific person appears,
is reproduced. Thus, there is a demand of watching [...]
non-reproduction information may be added to the [...]
produced on the basis of the reproduction/non-reproduction
[...]
FIGS. 44, 45 and 46 show examples of a data [...]
information 123 is added to the data structure in [...].
FIG. 46 shows a data structure in which the
reproduction/non-reproduction information 123 is added to
the data structure of FIG. 35. Though not shown, it is
possible to add the reproduction/non-reproduction
information [...]
[...] case, when the [...] of reproduction, the video is
reproduced. When the level is less than the threshold value,
the video is not reproduced. The user can directly or
indirectly specify the threshold value.
The reproduction/non-reproduction information 123 may be set
as independent information to be [...] specified, the
non-reproduction can be specified when the display time
shown in the display time information [...] the display time
121 [...] quantity of data [...] non-reproduction portion.
If the [...] independent information, [...] a value or more
[...] a constant value or more, the [...]
[...] video frame group located [...] display time
information [...] In this example, the sections of [...]
reproduction [...] the non-reproduction, [...]
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the reproduction portion of the original video is set to T. 
Normally, the display time D+_i is set to the time which
is required to reproduce the original video at a normal
speed. The reproduction speed may instead be set to a
predetermined high speed. Information may be described as
to how many times the normal speed is to be used. When it is
desired that the video is reproduced at N times high speed, the
display time D+_i of the reproduction portion is multiplied
by 1/N. For example, in order to perform reproduction
in the predetermined time D', the display time
D+_i of each reproduction portion may be multiplied by
D'/ΣD+_i before display.
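The two scalings described here (multiplying by 1/N for N times high-speed reproduction, and scaling so that the total equals a predetermined time D') can be sketched as below. The function names are hypothetical, and display times are assumed to be given in seconds as a simple list.

```python
def scale_for_speed(display_times, n):
    """Multiply each display time by 1/N for N-times high-speed reproduction."""
    if n <= 0:
        raise ValueError("N must be positive")
    return [d / n for d in display_times]


def scale_to_total(display_times, target_total):
    """Scale each display time by D' / sum(D_i) so that the total equals D'."""
    total = sum(display_times)
    if total <= 0:
        raise ValueError("total display time must be positive")
    factor = target_total / total
    return [d * factor for d in display_times]
```

With display times of 2.0 and 4.0 seconds, double-speed reproduction gives 1.0 and 2.0 seconds; scaling 1.0 and 3.0 seconds to a total of 2.0 seconds gives 0.5 and 1.5 seconds.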

[0229] If the display time of each frame (or a frame
group) is determined on the basis of the frame information,
the determined display time may be adjusted.
[0230] In a method in which the calculated display
time is not adjusted, the display time which is calculated
without taking into consideration the generation of the
non-reproduction section is used as it is, so that when
a display time exceeding 0 is originally allocated to the
non-reproduction section, the whole display time is
shortened by that allocated portion.
[0231] In a method in which the calculated display
time is adjusted, for example, if a display time exceeding
0 is originally allocated to the non-reproduction section,
the adjustment is made by multiplying the display time
of each of the frames (or the frame groups) to be
reproduced by a constant so that the whole display
time becomes equal to the time which would result if the
non-reproduction section were also reproduced.
[0232] The user may make a selection as to whether
the adjustment is to be made.
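The difference between the unadjusted and the adjusted method can be illustrated with a short sketch. It assumes, purely for illustration, that each frame is given as a `(display_time, reproduce)` pair; the function name and the `adjust` flag (standing in for the user's selection) are hypothetical.

```python
def plan_display_times(frames, adjust=True):
    """Return the display times of the frames that are actually reproduced.

    frames: list of (display_time, reproduce) pairs, where `reproduce`
            is False for frames in a non-reproduction section.
    adjust: if True, stretch the kept display times by a constant factor
            so that their total equals the total before any section
            was dropped; if False, use the calculated times as they are.
    """
    kept = [d for d, reproduce in frames if reproduce]
    if not adjust:
        return kept
    whole = sum(d for d, _ in frames)
    kept_total = sum(kept)
    if kept_total == 0:
        return kept  # nothing to reproduce, nothing to scale
    factor = whole / kept_total
    return [d * factor for d in kept]
```

For frames of 1.0, 2.0 and 1.0 seconds where the middle frame is not reproduced, the unadjusted method yields 1.0 and 1.0 seconds (total shortened to 2.0), while the adjusted method yields 2.0 and 2.0 seconds (total kept at 4.0).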
[0233] If the user specifies the N times reproduction,
the N times high-speed reproduction processing may be
conducted without the adjustment of the calculated display
time. The N times high-speed reproduction
processing may instead be conducted on the basis of the display
time after the adjustment of the calculated display time
in the above manner (the display time of the former becomes
shorter).

[0234] The user may specify the whole display time.
In this case as well, for example, the display time of each
frame (or a frame group) to be reproduced is multiplied
by a constant to make an adjustment so that the
display time becomes equal to the specified whole display
time.

[0235] FIG. 48 shows one example of the processing
procedure for reproducing only a portion of the video on
the basis of the reproduction/non-reproduction information
123.

[0236] At step S162, the frame information (video location
information and display time information) is read
to determine whether the frame is to be reproduced from
the reproduction/non-reproduction information in the
display time information at step S163.
[0237] When it is determined that the reproduction is
to be conducted, the frame is displayed for the duration
of the display time at step S164. When it is determined
that the reproduction is not to be conducted, the frame
is not displayed and the processing moves to the next
frame.

[0238] It is determined at step S161 whether or not
the whole video to be reproduced has been processed. When
the whole video has been processed, the reproduction
processing is ended.
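The flow of FIG. 48 (steps S161 to S164) can be sketched as a simple loop. This is a minimal sketch under assumptions: the class `FrameInfo`, its field names, and the representation of the reproduction/non-reproduction information as a boolean are illustrative choices, and the function returns a playback plan instead of driving a real display.

```python
from dataclasses import dataclass


@dataclass
class FrameInfo:
    location: int        # video location information (e.g. a frame number)
    display_time: float  # display time information, in seconds
    reproduce: bool      # reproduction/non-reproduction information


def playback_plan(frame_infos):
    """Return (location, display_time) pairs for the frames to be shown."""
    plan = []
    for info in frame_infos:      # S161: repeat until the whole video is processed
        # S162: the frame information has been read into `info`
        if info.reproduce:        # S163: is this frame to be reproduced?
            plan.append((info.location, info.display_time))  # S164: display it
        # otherwise the frame is skipped and the next frame is processed
    return plan
```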

[0239] When it is determined at step S163 whether or not
the frame is to be reproduced, it is desired in some
cases that the determination depend on the taste
of the user. In this case, it is determined from the user
profile, in advance of the reproduction of the video,
whether or not the non-reproduction portion is to be
reproduced. When the non-reproduction portion is to be reproduced,
the frame is reproduced without fail at step S164.
[0240] In addition, when the reproduction/non-reproduction
information is described as a continuous value,
a threshold value for differentiating the reproduction and
the non-reproduction is determined from the user profile,
and the reproduction or the non-reproduction is determined
depending on whether or not the reproduction/non-reproduction
information exceeds the threshold value.
Instead of using the user profile, for example, the
threshold value may be calculated from the importance set for
each frame, or information may be received in advance
from the user as to whether the reproduction or non-reproduction
is provided in real time.
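The threshold decision for continuous-valued reproduction/non-reproduction information might look like the following sketch. The function and parameter names are hypothetical; in particular, treating a level exactly equal to the threshold as "reproduce" is an assumption made here, since the text only distinguishes levels above and below the threshold.

```python
def should_reproduce(level, threshold, play_non_reproduction=False):
    """Decide whether a frame is reproduced.

    level:     continuous reproduction/non-reproduction value of the frame
    threshold: value obtained e.g. from the user profile
    play_non_reproduction: hypothetical user-profile flag; when True the
        non-reproduction portions are reproduced as well, so every frame
        is reproduced without fail.
    """
    if play_non_reproduction:
        return True
    # Assumption: a level equal to the threshold counts as reproduction.
    return level >= threshold
```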
[0241] In this manner, it becomes possible to reproduce
only a portion of the video by adding to the frame
information the reproduction/non-reproduction information
123 for controlling whether the video is reproduced
or not, with the result that it becomes possible to reproduce
only the high-light scene or only the scene in which
a person or an object of interest appears.
[0242] Next, there will be explained a describing
method in which the location information of media (for example,
text or sound) other than the video, associated with the
video to be displayed, and the time for displaying or
reproducing that media are added to the frame information (see
FIG. 1) as additional information.

[0243] In FIG. 8, the video location information 101
and the display time information 102 are included in
each frame information 100. In FIG. 34, the video location
information 101 and importance information 122 are
included in each frame information 100. In FIG. 35, the
video location information 101, the display time information
121, and importance information 122 are included
in each frame information 100. In FIGS. 44, 45, and 46,
there is further shown an example in which the
reproduction/non-reproduction information 123 is included in
each frame information 100. In any example, 0 or more
items of sound location information 2703 and sound reproduction
time information 2704, and 0 or more items of text information 2705
and text display time information 2706 (however, 1 or
more of any of these items) may be added.

[0244] FIG. 49 shows an example in which one set of
sound location information 2703 and sound reproduction
time information 2704 and N sets of text information
2705 and text display time information 2706 are added
to an example of the data structure of FIG. 8.
[0245] The sound is reproduced for the time indicated
by the sound reproduction time information 2704 from
the location indicated by the sound location information
2703. An object of reproduction may be sound information
attached to the video from the beginning. Alternatively,
background music may be created and newly added.
[0246] The text indicated by the text information 2705
is displayed for the time indicated by the
text display time information 2706. A plurality of items
of text information may be added to one video frame.
[0247] The time when the sound reproduction and the
text display are started is the same as the time when the
associated video frame is displayed. The sound reproduction
time and the text display time are set within the
range of the associated video frame time. If continuous
sound is reproduced over a plurality of video frames, the
sound location information and the reproduction time
may be set to be continuous.
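A data structure along the lines of FIG. 49 (one set of sound information and N sets of text information attached to a frame) could be sketched as follows. The class and field names are illustrative assumptions; only the reference to sound location, sound reproduction time, text and text display time comes from the description. The `fits_frame` check reflects the statement that the sound reproduction time and text display time are set within the frame's display time.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SoundItem:
    location: float           # sound location information (2703)
    reproduction_time: float  # sound reproduction time information (2704)


@dataclass
class TextItem:
    text: str                 # text information (2705)
    display_time: float       # text display time information (2706)


@dataclass
class FrameInfo:
    video_location: int       # video location information (101)
    display_time: float       # display time information (102)
    sound: Optional[SoundItem] = None
    texts: List[TextItem] = field(default_factory=list)

    def fits_frame(self) -> bool:
        """True if every attached medium ends within the frame's display time."""
        if self.sound and self.sound.reproduction_time > self.display_time:
            return False
        return all(t.display_time <= self.display_time for t in self.texts)
```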

[0248] With such a method, summarized sound and
summarized text can be provided.
[0249] FIG. 50 shows one example of a method for
describing the sound information separately from the
frame information. This is an example of a data structure
for reproducing sound associated with the video frame
which is displayed at the time when the special reproduction
is conducted. A set of the location information
2801 showing the location of the sound to be reproduced,
reproduction start time 2802 when the sound reproduction
is started, and reproduction time 2803 for which
the reproduction is continued is set as one item of sound
information 2800, and the data is described as an array of
this sound information.

[0250] FIG. 51 shows a data structure for describing
the text information. The data structure has the same
structure as that of the sound information of FIG. 50, and a set
of character code location information 2901 of the text
to be displayed, a display start time 2902, and a display
time 2903 is set as one item of text information 2900,
and the data is described as an array of this text information.
Instead of the character code
location information 2901, location information may
be used which indicates a location where the character
code is stored, or a location where the character is
stored as a video.

[0251] The above sound information or text information
is synchronized with the display of the video
frame and is displayed as information associated with the
video frame or a constant video frame section in which
the displayed video frame is present. As shown in FIG.
52, the reproduction or the display of the sound information
or the text information is started with the lapse
of time shown by the time axis 3001. First,
the video 3002 is displayed and reproduced for the described
display time in the order in which the respective
video frames are described. Reference numerals 3005,
3006 and 3007 denote respective video frames, and a
predetermined display time is allocated to each. The
sound 3003 is reproduced when the reproduction start
time described in each sound information comes. When
the reproduction time described in a similar manner has
elapsed, the reproduction is stopped. As shown
in FIG. 52, a plurality of sounds 3008 and 3009 may be
reproduced. In a similar manner to the sound, the text
3004 is also displayed when the display start time described
in each item of the text information comes. When the display
time which is described has elapsed, the display
is stopped. A plurality of texts 3010 and 3011
may be displayed at the same time.
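The timing behaviour of FIG. 52, where sound or text described separately from the frame information starts at its start time and stops once its duration has elapsed, can be sketched as below. `TimedItem` and `active_items` are hypothetical names; one class is used for both sound information and text information because the description gives them the same structure (location, start time, duration).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TimedItem:
    location: str    # where the sound or the character code is stored
    start: float     # reproduction/display start time on the time axis
    duration: float  # reproduction time or display time


def active_items(items: List[TimedItem], t: float) -> List[TimedItem]:
    """Return the items being reproduced or displayed at time t."""
    return [it for it in items if it.start <= t < it.start + it.duration]
```

Because the start times are independent of one another, several sounds or several texts may be returned for the same instant, which corresponds to the overlapping items shown in FIG. 52.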
[0252] It is not required that the sound reproduction
start time and the text display start time coincide with
the time at which the video frame is displayed. It is not
required that the sound reproduction time and the text
display time coincide with the display time of the video
frame. These times can be freely set. Conversely,
the display time of the video frame may be changed in
accordance with the sound reproduction time and the
text display time.

[0253] These times can be set manually.

[0254] In order to save the trouble of manual determination,
it is preferable to detect a phenomenon which
is likely to appear in a video scene which seems to be
important and to automatically set these times. Hereinafter,
several examples of automatic setting are shown.

[0255] FIG. 53 shows one example of a processing
procedure in which a continuous video frame section from
one change-over of the screen up to the next change-over
of the screen, referred to as a shot, is determined,
so that the total of the display times of the video
frames included in the shot is defined as the sound
reproduction time. FIG. 53 also serves as a functional
block diagram.

[0256] At step S3101, the shot is detected from the
video. For this purpose, there are used such methods
as a method for detecting a cut of a motion picture from
the MPEG bit streams using a tolerance ratio detection
method (The Transactions of the Institute of Electronics,
Information and Communication Engineers, Vol. J82-D-II,
No. 3, pp. 361-370, 1999) and the like.

[0257] At step S3102, the video frame location information
is referred to, thereby investigating which shot the
respective video frames belong to. Furthermore, the display
time of each shot is calculated by taking
the total of the display times of its video frames.

[0258] For example, the sound location information is
set as the sound location corresponding to the start of
the shot. The sound reproduction start time may be
allowed to coincide with the display time of the initial video
frame which belongs to each shot, while the sound
reproduction time may be set to be equal to the display
time of the shot. Otherwise, in accordance with the
reproduction time of the sound, the display time of the video
frames included in each shot may be corrected.
Although the shot is detected here, if a data structure is
taken wherein the importance information is described
in the frame information, the section having importance
exceeding the threshold value may be determined by using
the importance with respect to the video frame so that
the sound included in the section may be reproduced.
[0259] If the determined reproduction time does not
meet a fixed criterion, the sound need not be reproduced.
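Setting the sound reproduction start time and reproduction time from detected shots can be sketched as follows. The sketch assumes that shot detection has already assigned a shot identifier to each frame; the function name, the dictionary keys, and the `min_duration` parameter (one possible form of the criterion below which no sound is reproduced) are all illustrative assumptions.

```python
def sound_info_for_shots(frames, min_duration=0.0):
    """Derive per-shot sound timing from frame display times.

    frames: list of (shot_id, display_time) pairs in display order.
    Returns one dict per shot with the start time (display time of the
    shot's first frame on the summary time axis) and the duration
    (total display time of the frames in the shot). Shots shorter than
    `min_duration` get no sound and are omitted.
    """
    infos = []
    clock = 0.0
    current = None
    for shot_id, display_time in frames:
        if current is None or current["shot"] != shot_id:
            current = {"shot": shot_id, "start": clock, "duration": 0.0}
            infos.append(current)
        current["duration"] += display_time
        clock += display_time
    return [info for info in infos if info["duration"] >= min_duration]
```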

[0260] FIG. 54 shows one example of a processing
procedure in which important words are taken out by sound
recognition from sound data corresponding to the shot or
the video frame section having high importance,
so that the words, the sound including the words,
or the sound in which a plurality of words are combined
is reproduced. FIG. 54 also serves as a functional
block diagram.

[0261] At step S3201, the shot is detected. In place of
the shot, the video frame section having high importance
may be calculated.
[0262] At step S3202, the sound recognition is carried
out with respect to the sound data section corresponding
to the obtained video frame section.
[0263] At step S3203, sounds including the important
word portion or sounds of the important word portion are
determined from the recognition result. In order to select
the important words, an important word dictionary 3204
is referred to.

[0264] At step S3205, the sound for reproduction is
created. Continuous sounds including the important
words may be used as they are. Only important words
may be extracted. Sounds having a combination of a
plurality of important words may be created.
[0265] At step S3206, in accordance with the reproduction
time of the created sound, the display time of the
video frame is corrected. Alternatively, the number of
selected words may be decreased or the reproduction
time of the sound may be shortened so that the sound
reproduction time is set to be within the display time of
the video frame.
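The selection of important words so that the resulting sound fits within the display time of the video frame might be sketched as below. The sketch assumes a recognizer that returns `(word, start, end)` triples; this output format, the function name, and the simple greedy policy of skipping words that no longer fit are assumptions for illustration, not the method of the description.

```python
def select_important_words(recognized, dictionary, display_time):
    """Pick important words whose total spoken length fits the display time.

    recognized:   list of (word, start, end) triples from sound recognition
                  (hypothetical recognizer output format)
    dictionary:   set of important words (the important word dictionary)
    display_time: display time available for the video frame section
    Returns the selected words and their total reproduction time.
    """
    picked = []
    used = 0.0
    for word, start, end in recognized:
        if word not in dictionary:
            continue
        length = end - start
        if used + length > display_time:
            continue  # decrease the number of selected words to fit
        picked.append(word)
        used += length
    return picked, used
```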

[0266] FIG. 55 shows one example of a procedure in
which text information is obtained from the telop. FIG.
55 also serves as a functional block diagram.
[0267] In the processing of FIG. 55, the text information
is obtained from the telop displayed in the video or
from the sound.

[0268] At step S3301, the telop displayed in the video
is read. This includes a method in which the telop in the
original video is automatically extracted with a method or the
like described in, for example, a literature such as
"... extracting the character portion from the video for the
telop region" by Osamu ... (CVIM ...), and a method in which
the telop is read by man to be manually input.

[0269] At step S3302, important words are taken out
from the telop character string which has been read. In
the judgment of important words, an important word
dictionary 3303 is used. The telop character string which
is read may be used as the text information as it is. Alternatively,
the extracted words may be arranged so that a sentence
representing the video frame section is constituted with only
the important words to provide the text information.
[0270] FIG. 56 shows one example for obtaining the
text information from the sound. FIG. 56 also serves
as a functional block diagram.
[0271] In the sound recognition processing at step
S3401, sound is recognized.

[0272] At step S3402, important words are taken out
from the recognized sound data. In the judgment of
important words, an important word dictionary 3403 is
used. The recognized sound data may be used as text
information as it is. Alternatively, the extracted words may be
arranged so that a sentence representing the video frame
section is constituted with only the important words to
provide the text information.
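Constituting text information from only the important words of a recognized character string can be illustrated with a very small sketch. It assumes the recognition result is already available as a plain string and that the important word dictionary is a set of lower-case words; the function name and the whitespace-based word splitting are simplifications made for the example.

```python
def summarize_text(recognized_text, important_words):
    """Keep only dictionary words, in their original order, as summary text."""
    kept = []
    for token in recognized_text.split():
        word = token.strip(".,!?")          # drop surrounding punctuation
        if word.lower() in important_words:
            kept.append(word)
    return " ".join(kept)
```

The same function could be applied to a telop character string, since both procedures use an important word dictionary in the same way.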

[0273] FIG. 57 shows an example of a processing procedure
for taking out text information with telop recognition
from the shot or from the video frame section having
high importance and preparing the text information.
FIG. 57 also serves as a functional block diagram.
[0274] At step S3501, the shot is detected from the
video. Instead of the shot, the section having high
importance may be determined.

[0275] At step S3502, the telop appearing in the video
frame section is recognized.
[0276] At step S3503, the important words are extracted
by using an important word dictionary 3504.
[0277] At step S3505, text for the display is created.
For this purpose, a telop character string including
important words may be used. Only important words or a
character string using the important words may be used
as text information. If text information is obtained by
sound recognition, the telop recognition processing at
step S3502 is replaced by sound recognition processing
which takes the sound as input. The text information is displayed
together with the video frame in which the text is displayed
as telop, or with the video frame of the time at which the
data is reproduced as sound. Otherwise, the text information
in the video frame section may be displayed at one
time.

[0278] FIGS. 58A and 58B are views showing a display
example of the text information. As shown in FIG.
58A, the display may be divided into the text information
display area 3601 and the video display area 3602. As
shown in FIG. 58B, the text information may be
overlapped with the video display area 3603.
[0279] Respective display times (reproduction times)
of the video frame, the sound information and the text
information may be adjusted so that all the media information
is synchronized. For example, at the time of the
double speed reproduction of the video, important
sounds are extracted by the above method, and sound
information whose length is half the normal reproduction
time is obtained. Next, the display time is allocated to the video
frame associated with respective sounds. If the display
time of the video frame is determined so that the scene
change quantity becomes constant, the sound reproduction
time or the text display time is set to be within
the display time of the respectively associated video
frames. Otherwise, a section including a plurality of video
frames is determined like the shot, so that the sound
or the text included in the section is reproduced or
displayed in accordance with the display time of the section.

[0280] So far, the explanation has focused mainly on video
data. However, the data structure of the
present invention can be adapted to data having no
frame information, such as sound data. It is possible to
use sound information and text information in a form
without the frame information. In this case, a summary
is created which comprises only sound information or
text information with respect to the original video data.
In addition, a summary can be created which comprises
only sound information and text information with respect
to sound data and music data.
[0281] Though the data structures shown in FIGS. 50
and 51 are used to describe the sound information and
text information in synchronization with the video data,
it is also possible to summarize sound data and text data
only. To summarize the sound data, the data structure
shown in FIG. 50 can be used irrespective of the video
information. To summarize the text data, the data structure
shown in FIG. 51 can be used irrespective of the
video information. At that time, in the same manner as
in the case of the frame information, the original data
information may be added to describe a correspondence
relationship between the original sound and music
data and the sound information and text information.
[0282] FIG. 59 shows an example of a data structure
in which the original data information 4901 is included
in the sound information shown in FIG. 50. If the original
data is the video, the original data information 4901
indicates the section of video frames (start point information
4902 and section length information 4903).
[0283] If the original data is sound data and music data,
the original data information 4901 indicates the section
of sound and music.

[0284] FIG. 60 shows an example of a data structure
in which the original data information 4901 is included
in the sound information shown in FIG. 30.
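A sound information item carrying original data information (start point and section length) could be represented as in the following sketch. The class names, field names and the lookup function `original_section_at` are hypothetical; only the idea that each summary item records the section of the original data it corresponds to is taken from the description.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class OriginalDataInfo:
    start_point: float     # start point information (4902)
    section_length: float  # section length information (4903)


@dataclass
class SoundInfo:
    location: float            # location of the sound to be reproduced
    start_time: float          # reproduction start time in the summary
    reproduction_time: float   # how long the sound is reproduced
    original: OriginalDataInfo # section of the original data it summarizes


def original_section_at(summary: List[SoundInfo], t: float) -> Optional[Tuple[float, float]]:
    """Return (start, end) of the original section summarized at summary time t."""
    for info in summary:
        if info.start_time <= t < info.start_time + info.reproduction_time:
            o = info.original
            return (o.start_point, o.start_point + o.section_length)
    return None
```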
[0285] FIG. 61 explains an example in which sound/
music is summarized by using the sound information.
The original sound/music is divided into several sections.
A portion of a section is extracted as the summarized
sound/music so that the summary of the original
data is created. For example, a portion 5001 of the
section 2 is extracted as summarized sound/music to
be reproduced as a section of the summary. As an
example of a method for dividing the section, the music
may be divided into chapters and the conversation may
be divided by the contents.

[0286] Furthermore, in the same manner as in the
case of the frame information, the description of the original
data file and the section are included in the sound
information and the text information, with the result that
a plurality of sound/music data items can be summarized
together. At this time, if identification information
is added to the individual original data, the original data
identification information may be described in place of
the original data file and the section.
[0287] FIG. 62 explains an example in which sound/
music is summarized by using the sound information.
Portions of plural sound/music data items are extracted
as the summarized sound/music so that the summary
of the original data is created. For example, a portion
5001 of the sound/music data item 2 is extracted as
summarized sound/music to be reproduced as a section
5102 of the summary. As one usage, a portion of each
piece of music included in one music album may be
extracted, so that summarized data for trial can be created.

[0288] If an album is summarized, the title of the music
may be included in the music information when it is
preferable that the title of the music can be known. This
information is not indispensable.
[0289] Next, a method of providing video data will be
explained.

[0290] If the special reproduction control information
created in the processing of the embodiment is provided
for use, it is necessary to provide the special reproduction
control information from the side of those who
create the information to the side of the user by some
means. As this method of providing the special reproduction
control information, various forms can be considered
as exemplified below:

(1) Video data and special reproduction control information
are recorded on one (or a plurality of) recording
medium (or media) and provided at the same time;

(2) Video data is recorded on one (or a plurality of)
recording medium (or media) and provided, and the
special reproduction control information is separately
recorded on one (or a plurality of) recording
medium (or media) and provided;

(3) Video data and the special reproduction control
information are provided via the communication
medium on the same occasion;

(4) Video data and the special reproduction control
information are provided via the communication
medium on different occasions.

[0291] According to the above described embodiments,
a special reproduction control information describing
method for describing special reproduction control
information provided for special reproduction with
respect to the video contents describes, as the frame
information, for each of frames or groups of continuous
or adjacent frames selectively extracted from the whole
frame series of video data constituting the video contents,
first information showing a location at which video
data of the one frame or one group is present and
second information associated with display time allocated
to the one frame or the frame group, and/or third information
showing importance allocated to the one frame
or the frame group corresponding to the frame information.
[0292] According to the above described embodiments,
a computer readable recording medium storing
a special reproduction control information stores at least
frame information described for each of frames or
groups of continuous or adjacent frames selectively extracted
from the whole frame series of video data constituting
the video contents, the frame information comprising
first information showing a location at which video
data of the one frame or one group is present and
second information associated with display time allocated
to the one frame or the frame group, and/or third information
showing importance allocated to the one
frame or the frame group corresponding to the frame
information.

[0293] According to the above described embodiments,
a special reproduction control information describing
apparatus/method for describing special reproduction
control information provided for special reproduction
with respect to the video contents describes, as
the frame information, for each of frames or groups of
continuous or adjacent frames selectively extracted
from the whole frame series of video data constituting
the video contents, video location information showing
a location at which video data of the one frame or one
group is present and display time control information
including display time information and basic information
based on which the display time is calculated, to be allocated
to the one frame or the frame group.
[0294] According to the above described embodiments,
a special reproduction apparatus/method which
enables a special reproduction with respect to video
contents, wherein special reproduction control information
is referred to which includes at least frame information
including video location information showing a location
at which one frame data or one frame group data
is present, which information is described for each of the
frame groups comprising one frame selectively extracted
out of the whole frame series of the video data allocated
to the video contents and constituting the video
contents or a plurality of continuous or adjacent frames;
the one frame data or the frame group data corresponding
to each frame information is obtained on the basis
of video location information included in the frame information
while the display time which should be allocated
to each frame information is determined on the basis of
display time control information included in at least each
frame information, and the one frame or the plurality
of frames which is or are obtained is reproduced
at the determined display time in a predetermined order,
thereby carrying out a special reproduction.
[0295] In the above described embodiments, for example,
image data is created in advance, which is extracted
in frame units from location information on an
effective video frame or an original video which is used
for display, and the video frame location information or
information on the display time of the image data is created
separately from the original video. Either video
frames or the image data extracted from the original video
is continuously displayed on the basis of the display
information so that a special reproduction such as a double
speed reproduction, a trick reproduction, jump continuous
reproduction or the like is enabled.
[0296] In the double speed reproduction for confirming
the contents at a high speed, display time is determined
in advance in such a manner that the display time
is extended at a location where a motion of the scene is
large while the display time is shortened at a location
where the motion is small so that the change in the display
screen becomes constant as much as possible.
Alternatively, the same effect can be obtained even when
the location information is determined so that an interval
of the extracted location is made small at a location
where a motion of the video frame or video data used
for the display is large while the interval is made large
at a location where the motion is small. A reproduction
speed control value may be created so that, as a whole, the
double speed value or the reproduction time designated
by the user is achieved. A long video can be
viewed at double speed reproduction, so that the video
can be easily viewed in a short time, and the contents
can be grasped in a short time.
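One simple way to extend the display time where motion is large and shorten it where motion is small, while meeting a total reproduction time designated by the user, is to allocate display times in proportion to the scene change quantity. The sketch below shows this; strict proportionality, the function name, and the equal split used when no change is measured are assumptions for illustration and are not stated in the description.

```python
def allocate_display_times(scene_change, total_time):
    """Split `total_time` across frames in proportion to scene change quantity.

    scene_change: non-negative scene change quantity per extracted frame
    total_time:   whole reproduction time designated by the user
    """
    if not scene_change:
        return []
    total_change = sum(scene_change)
    if total_change <= 0:
        # Assumption: with no measurable change, share the time equally.
        return [total_time / len(scene_change)] * len(scene_change)
    return [total_time * c / total_change for c in scene_change]
```

With scene change quantities of 1.0 and 3.0 and a designated total of 8.0 seconds, the frames receive 2.0 and 6.0 seconds, so the amount of change per second of display is the same for both.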
[0297] It is possible to reproduce videos so that important
locations are not overlooked by extending the
display time at the important locations and shortening
the display time at unimportant locations in accordance
with the importance of the video.
[0298] Only important locations may be efficiently reproduced
by partially omitting a part of the video without
displaying the whole video frame.
[0299] According to embodiments of the present invention,
an effective special reproduction is enabled on
the basis of the control information on the reproduction
side by arranging and describing, as control information
provided for a special reproduction of the video contents,
a plurality of frame information including a method for
obtaining a frame or a group of frames selectively extracted
from the original video, information on the display
time (absolute or relative value) allocated to the
frame or the group of frames and information which
forms the basis for obtaining the information on the display
time.

[0300] For example, each of the above functions can
be realized as software. The above embodiments can
be realized as a computer readable recording medium
on which a program is recorded for allowing the computer
to conduct predetermined means or for allowing the
computer to function as predetermined means, or for
allowing the computer to realize a predetermined function.
[0301] The structures shown in each of the embodi- 
ments are one example, and are not intended to exclude 
other structures. It is also possible to provide a structure 
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which is obtained by replacing a part of the structure 
exemplified above with anotherslructure, omitting a part 
of the exemplified structure, adding a different function 
to the exemplified structure, and combining such meas- 
ures. A different structure logically equivalent to the ex- 
emplified structure, a different structure including a part 
logically equivalent to the exemplified structure, and a 
different structure logically equivalent to the essential
portion of the exemplified structure can be provided. An-
other structure identical to or similar to the exemplified
structure, or a different structure having the same effect
as the exemplified structure or a similar effect can be
provided.

[0302] In each of the embodiments, various variations 
with respect to various structure components can be put 
into practice in an appropriate combination.
[0303] Each of the embodiments includes or inherent-
ly contains an invention associated with various view-
points, stages, concepts or categories such as, for ex-
ample, an invention as a method for describing informa-
tion, an invention as information which is described, an
invention as an apparatus or a method corresponding 
thereto, an invention as an inside of the apparatus or a 
method corresponding thereto. 

[0304] Consequently, the invention can be extracted 
without being limited to the exemplified structure from 
the content disclosed in the embodiment according to 
this invention. 



Claims 

1. A method of describing frame information, the
method characterized by comprising: 

describing, for a frame extracted from a plural-
ity of frames in a source video data, first infor-
mation (101) specifying a location of the ex-
tracted frame in the source video data; and
describing, for the extracted frame, second in-
formation (102) relating to a display time of the
extracted frame.

2. The method according to claim 1, characterized in
that the extracted frame comprises a group of 
frames, and the first information comprises informa- 
tion specifying a location of the extracted group of 
frames in the source video data. 

3. The method according to claim 1 or 2, character-
ized by further comprising describing, for the ex-
tracted frame, third information (122) relating to im-
portance of the extracted frame.

4. The method according to claim 1, 2 or 3, charac-
terized in that the first information comprises infor-
mation specifying an image data file created from
the video data of the extracted frame.

5. The method according to claim 1, characterized in
that the extracted frame comprises a frame extract-
ed from a plurality of frames included in a temporal
section of the source video data, and further de-
scribing fourth information specifying the temporal
section of the source video data.

6. The method according to claim 5, characterized in
that the first information comprises information
specifying an image data file created from the
source video data of the extracted frame, the image
data corresponding to the extracted frame.

7. The method according to any one of the preceding
claims, characterized in that the second informa-
tion comprises information relating to such display
time that a frame activity value during a special re-
production is kept substantially constant.

8. The method according to any one of the preceding
claims, characterized by further comprising de-
scribing fifth information (123) indicating whether
the extracted frame is reproduced or not.

9. The method according to claim 1, characterized
in that the first information comprises one of infor-
mation specifying a location of the extracted frame
among the plurality of frames and information spec-
ifying a location of image data within an image data
file created from the source video data and stored
separately from the video data, the image data cor-
responding to the extracted frame.

10. The method according to any one of the preceding
claims, characterized by further comprising de-
scribing, for media data other than the source video
data including the extracted frame, information
specifying a location of the media data and informa-
tion relating to a display time of the media data.

11. An article of manufacture comprising a computer
usable medium storing frame information, the frame
information characterized by comprising:

first information (101), described for a frame ex-
tracted from a plurality of frames, specifying a
location of the extracted frame in the source
video data; and
second information (102), described for the ex-
tracted frame, relating to a display time of the
extracted frame.



12. The article of manufacture according to claim 11,
characterized in that the extracted frame compris-
es a group of frames, and the first information com-
prises information specifying a location of the ex-
tracted group of frames in the source video data.

13. The article of manufacture according to claim 11 or
12, characterized in that the frame information
comprises third information (122) relating to impor-
tance of the extracted frame.

14. The article of manufacture according to claim 11, 12 
or 13, characterized in that the first information 
comprises information specifying an image data file
created from the video data of the extracted frame. 

15. The article of manufacture according to claim 11, 
characterized by further comprising storing the 
source video data and an image data file corre- 
sponding to the source video data of the extracted 
frame in addition to the frame information. 

16. An apparatus for creating frame information, the ap-
paratus characterized by comprising:

a unit configured to extract a frame from a plu- 
rality of frames in a source video data; 
a unit configured to create the frame informa-
tion including first information specifying a lo-
cation of the extracted frame and second infor-
mation relating to a display time of the extracted
frame; and

a unit configured to link the extracted frame to 
the frame information. 

17. A method of creating frame information, the method 
characterized by comprising: 



extracting a frame from a plurality of frames in 
a source video data; and 
creating the frame information including first in-
formation specifying a location of the extracted
frame in the source video data and second in-
formation relating to a display time of the ex-
tracted frame.



18. An apparatus for performing a special reproduction,
characterized by comprising: 



a unit configured to refer to frame information 
described for a frame extracted from a plurality 
of frames in a source video data and including 
first information specifying a location of the ex-
tracted frame in the source video data and sec-
ond information relating to a display time of the
extracted frame; 

a unit configured to obtain the video data cor-
responding to the extracted frame based on the
first information;

a unit configured to determine the display time
of the extracted frame based on the second in-
formation; and

a unit configured to display the obtained video 
data for the determined display time.

19. A method of performing a special reproduction,
characterized by comprising:

referring to frame information described for a
frame extracted from a plurality of frames in a
source video data and including first informa-
tion (101) specifying a location of the extracted
frame and second information (102) relating to
a display time of the extracted frame;
obtaining the video data corresponding to the
extracted frame based on the first information;
determining the display time of the extracted
frame based on the second information; and
displaying the obtained video data for the de-
termined display time.

20. An article of manufacture comprising a computer
usable medium having computer readable program 
code means embodied therein, the computer read- 
able program code means performing a special re- 
production, the computer readable program code 
means characterized by comprising: 



computer readable program code means for 
causing a computer to refer to frame informa-
tion described for a frame extracted from a plu-
rality of frames in a source video data and in-
cluding first information (101) specifying a loca-
tion of the extracted frame and second informa-
tion (102) relating to a display time of the ex-
tracted frame;

computer readable program code means for 
causing a computer to obtain the video data 
corresponding to the extracted frame based on 
the first information; 

computer readable program code means for
causing a computer to determine the display
time of the extracted frame based on the sec-
ond information; and

computer readable program code means for 
causing a computer to display the obtained vid- 
eo data for the determined display time. 

21. A method of describing sound information, the 
method characterized by comprising: 

describing, for a frame extracted from a plural- 
ity of sound frames in a source sound data, first 
information specifying a location of the extract- 
ed frame in the source sound data; and 
describing, for the extracted frame, second in-
formation relating to a reproduction start time
and reproduction time of the sound data of the
extracted frame.






22. An article of manufacture comprising a computer 
usable medium storing frame information, the frame 
information characterized by comprising:

first information, described for a frame extract-
ed from a plurality of sound frames, specifying
a location of the extracted frame in the source
sound data; and 

second information, described for the extracted 
frame, relating to a reproduction start time and 
reproduction time of the sound data of the ex-
tracted frame.

23. A method of describing text information, the method 
characterized by comprising: 

describing, for a frame extracted from a plural-
ity of text frames in a source text data, first in-
formation specifying a location of the extracted
frame in the source text data; and 
describing, for the extracted frame, second in- 
formation relating to a display start time and 
display time of the text data of the extracted
frame. 

24. An article of manufacture comprising a computer
usable medium storing frame information, the frame
information characterized by comprising:

first information, described for a frame extract-
ed from a plurality of text frames in a source
text data, specifying a location of the extracted 
frame in the source text data; and 
second information, described for the extracted
frame, relating to a display start time and dis- 
play time of the text data of the extracted frame. 



25. A carrier medium carrying computer readable in-
structions for controlling the computer to carry out
the method of any one of claims 1 to 10, 17, 19, 21
and 23.
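
Claim 7 relates the display time to a frame activity value that is kept substantially constant during the special reproduction. One possible reading, assumed here and not confirmed by the claims themselves, is that each extracted frame receives a share of the total reproduction time proportional to its activity value, so that the activity per unit of display time is the same for every frame. The function name and the equal-split fallback for zero activity are hypothetical.

```python
from typing import List


def display_times_for_constant_activity(
    activities: List[float], total_seconds: float
) -> List[float]:
    """Allocate display times so that activity / display_time is constant.

    Assumption: activities are non-negative frame activity values, one
    per extracted frame. Each frame gets a share of total_seconds in
    proportion to its activity.
    """
    if not activities:
        return []
    total_activity = sum(activities)
    if total_activity <= 0:
        # No measurable activity anywhere: fall back to equal display times.
        return [total_seconds / len(activities)] * len(activities)
    return [total_seconds * a / total_activity for a in activities]
```

For activity values 1, 3 and 4 and a total of 16 seconds, the frames would be shown for 2, 6 and 8 seconds, giving an activity of 0.5 per second of display for each of them.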